Abstract

In these notes we give an introduction both to Monstrous Moonshine and to the classification
of rational conformal field theories, using this as an excuse to explore several related
structures and go on a little tour of modern math. We will discuss Lie algebras, modular
functions, the finite simple group classification, vertex operator algebras, Fermat's Last
Theorem, category theory, (generalised) Kac-Moody algebras, denominator identities, the
A-D-E meta-pattern, representations of affine algebras, Galois theory, etc. This work is
informal and pedagogical, and aimed mostly at grad students in math or math phys, but
I hope that many interested nonexperts will find something of value here: like any good
Walt Disney movie I try not to completely ignore the 'grown-ups'. My emphasis is on
ideas and motivations, so these notes are intended to complement other papers and books
where this material is presented with more technical detail. The level of difficulty varies
significantly from topic to topic. The two parts (in fact any of the sections) can be
read independently.

Part 1. The classification of conformal field theories

1.1. Informal motivation 

In this section we will sketch a very informal and 'hand-wavy' motivation for what we
shall call the classification problem for rational conformal field theory (RCFT). Much of
this material is more carefully treated in e.g. [18].

A CFT is a quantum field theory (QFT), usually with a two-dimensional space-time,
whose symmetries include the conformal transformations. There are different approaches to
CFT; for one of these see [26,27]. Another formulation which has been deeply influential
is due to Graeme Segal [52]. It is motivated by string theory and is phrased in an important
mathematical language called category theory.

A category consists of two types of things. One are called objects, and the other are
called arrows (or morphisms). An arrow, written $f : A \to B$, has an initial and a final
object ($A$ and $B$ respectively). Arrows $f, g$ can be composed to yield a new arrow $f \circ g$,
if the final object of $g$ equals the initial object of $f$. Maps between categories are called
functors if they take the objects (resp., arrows) of one to the objects (resp., arrows) of the
other, and preserve composition.

The only difficulty people can have in understanding categories is in realising that 
there is no real content to them. It's just a language, highly abstract like the more familiar 
set theory, but in many contexts (a great example is the theory of knot invariants [58]) one 
which is both natural and suggestive. It tries to deflect some of our instinctive infatuation
with objects (nouns), to the mathematically more fruitful one with structure-preserving
maps between objects (verbs). A gentle introduction to the mathematics and philosophy 
of categories is [43]; we'll give a taste of this shortly. 

The standard example of a category is called Set, where the 'objects' are sets, and
the arrows from $A$ to $B$ are functions $A \to B$. A related example that Segal uses is Vect,
where the objects are complex vector spaces and the arrows are linear maps. A rather
trivial example of a functor $\mathcal{F}$ : Vect $\to$ Set sends a vector space $V$ to its underlying set,
also called $V$; i.e. $\mathcal{F}$ simply 'forgets' the vector space structure on $V$ and ignores the
fact that the arrows $f$ in Vect are linear. The other category Segal uses he calls $\mathcal{C}$; its
objects are disjoint unions of (parametrised) circles $S^1$, and the arrows are (conformal
equivalence classes of) cobordisms, i.e. (Riemann) surfaces whose boundaries are those
circles. Composition of arrows in $\mathcal{C}$ is accomplished by gluing the surfaces along the
appropriate boundary circles.

Consider the usual definition of a one-to-one function: $f(x) = f(y)$ only when $x = y$.
Category theory replaces this with the following. The arrow $f : A \to B$ is called 'monic' if
for any arrow $g : C \to B$, there exists at most one arrow $h : C \to A$ such that $f \circ h = g$
(equivalently, $f \circ h_1 = f \circ h_2$ forces $h_1 = h_2$). So
it's a sort of factorisation property. You can easily verify that in Set the notions of
'one-to-one' and 'monic' coincide. What does this redefinition gain us? It certainly doesn't
seem any simpler. But it does change the focus from the argument of $f$, to the global
functional behaviour of $f$, and a change of perspective can never be bad. And it allows
us to transport the idea of one-to-one-ness to arbitrary categories. For instance, in the
Riemann surface category $\mathcal{C}$, the 'one-to-one functions' are the genus-0 cobordisms.
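
As a sanity check of the claim about Set, here is a small Python sketch which compares 'one-to-one' with 'monic' by brute force on tiny finite sets. It uses the cancellation form of the definition ($f \circ h_1 = f \circ h_2$ forces $h_1 = h_2$). The helper names and the choice of test sets are my own assumptions, and probing with only two small sets $C$ is an illustration rather than a proof.

```python
from itertools import product

def all_functions(domain, codomain):
    """Yield every function domain -> codomain, encoded as a dict."""
    domain = list(domain)
    for values in product(codomain, repeat=len(domain)):
        yield dict(zip(domain, values))

def is_injective(f):
    """One-to-one: distinct arguments have distinct values."""
    return len(set(f.values())) == len(f)

def is_monic(f, A, test_objects):
    """Monic: f.h1 == f.h2 forces h1 == h2, for all h1, h2 : C -> A."""
    for C in test_objects:
        hs = list(all_functions(C, A))
        for h1 in hs:
            for h2 in hs:
                same_composite = all(f[h1[c]] == f[h2[c]] for c in C)
                if same_composite and h1 != h2:
                    return False
    return True

A = [0, 1]
B = ['x', 'y', 'z']
test_objects = [[0], [0, 1]]   # small sets C used to probe f (an assumption)

# The two notions agree on every one of the 9 functions A -> B.
agree = all(is_injective(f) == is_monic(f, A, test_objects)
            for f in all_functions(A, B))
print(agree)
```

Note that `is_monic` never looks at an argument of `f` directly: it only compares composites, which is exactly the change of focus described in the text.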

Or consider the notion of product. In category theory, we say that the triple $(P, a, b)$
is a product of objects $A, B$ if $a : P \to A$ and $b : P \to B$ are arrows, and if for any
$f : C \to A$, $g : C \to B$, there is a unique arrow $h : C \to P$ such that $f = a \circ h$ and $g = b \circ h$.
This notion unifies several constructions (each of which is the 'product' in an appropriately
chosen category): Cartesian product of sets; intersection of sets; multiplication of numbers;
the logical operator 'and'; direct product; infimum in a partially ordered set; etc. Sum
can be defined similarly, unifying the constructions of disjoint union, 'or', addition, tensor
product, direct sum, supremum, etc.
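
To make one item of this list concrete, the sketch below checks the universal property of a product in a partially ordered set, namely the divisors of 60 ordered by divisibility, where the infimum is the greatest common divisor. Treating this poset as a category (one arrow $x \to y$ exactly when $x$ divides $y$) and the names used are assumptions made for the illustration.

```python
from math import gcd

def divides(x, y):
    """There is an arrow x -> y exactly when x divides y."""
    return y % x == 0

def is_product(p, a, b, objects):
    """Universal property of a product in the divisibility poset.

    Arrows are unique when they exist, so (p, a, b) is a product iff
    p -> a and p -> b, and every c with c -> a and c -> b has c -> p.
    """
    if not (divides(p, a) and divides(p, b)):
        return False
    return all(divides(c, p)
               for c in objects
               if divides(c, a) and divides(c, b))

objects = [d for d in range(1, 61) if 60 % d == 0]   # divisors of 60

# For every pair (a, b) exactly one object is a product, and it is gcd(a, b).
all_products_are_gcds = all(
    [p for p in objects if is_product(p, a, b, objects)] == [gcd(a, b)]
    for a in objects for b in objects
)
print(all_products_are_gcds)
```

Reversing all the arrows (so that $x \to y$ when $y$ divides $x$) turns the same check into one for the least common multiple, which is the 'sum' in this poset.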

This generality of course comes with a price: it can wash away all of the endearing 
special features of a favourite theory or structure. There certainly are contexts where e.g. 
all human beings should be thought of as equal, but there are other contexts where the 
given human is none other than your mother and must be treated as such. It turns out that 
category theory provides a beautiful framework for understanding topological invariants 
such as the Jones-Reshetikhin-Turaev-Witten knot invariants (see e.g. [58]). And it seems
to be a natural language for formulating CFT axiomatically, as we'll now see. 

According to Segal, a CFT is a functor $\mathcal{T}$ from $\mathcal{C}$ to Vect, which obeys various
properties. The picture comes from string theory: the fundamental object is a 'vibrating'
loop; a state is given by a collection of these loops; each classical Feynman path from the
initial to the final states is a world-sheet, i.e. a surface $\Sigma$ whose boundary is all the loops
in the initial and final states. QFT assigns a complex number $I$ (the action) to each of
these world-sheets, and the quantum amplitude, written $\langle \mathrm{final} \,|\, \mathrm{initial} \rangle$, will then be the
integral over all world-sheets of $e^{iI/\hbar}$. (The quantum amplitude is how the theory makes
contact with experiment, as it tells us the probability of the given process
$|\mathrm{initial}\rangle \mapsto |\mathrm{final}\rangle$ happening.) This is what Segal is trying to capture formally. The vector
spaces in Vect come in in order to handle uniformly and simultaneously the various 'vibrations' of the
strings. In particular there is one basic vector space $\mathcal{H}$ (a Hilbert space of quantum states),
and the functor will take $n$ copies of $S^1$ to $\mathcal{H}^n = \mathcal{H} \otimes \cdots \otimes \mathcal{H}$.

The simplest interesting example here is the 'tree-level creation of a string from the
vacuum'. In this case the world-sheet looks like a bowl, i.e. topologically is a disk $D$ (if
we imagine the bowl to be made of rubber, we could grab its rim and stretch it down
flat onto the table, so we say a bowl and a disk are topologically equivalent; see also
§2.3). Segal's functor gives us a linear map $\mathcal{T}(D) : \mathbb{C} \to \mathcal{H}$ ($\mathcal{H}^0$ is just $\mathbb{C}$), which we can
think of equivalently as the assignment of the vector $\mathcal{T}(D)(1)$ in $\mathcal{H}$ to $D$. In the case of the
standard unit disk (i.e. where the parametrisation of the boundary $S^1$ is simply $\theta \mapsto e^{i\theta}$),
this vector is called the vacuum state $\Omega = |0\rangle$.

For another example, consider the 'vacuum-to-vacuum expectation value'. The initial
and final states (objects) here are both the empty set, so the world-sheets (arrows) are
closed Riemann surfaces. As usual in QFT, we can organise these by how many internal
'loops' are involved (this number is called the genus of the surface): topologically, 0-loop
(i.e. 'tree-level') world-sheets are spheres, 1-loop world-sheets are tori, etc. These closed
Riemann surfaces are discussed in more detail in §2.3. The 0-loop contribution isn't very
interesting (there is only one conformal equivalence class of spheres), so let us look at the
1-loop contribution. It will be of the form $\int Z([\mathrm{torus}])\, d[\mathrm{torus}]$, where $Z$ is a
complex-valued function called the partition function, and [torus] is a conformal equivalence class
of tori. In the Segal formalism we recover this in the following way: the functor takes
[torus] to a linear function from $\mathcal{H}^0 = \mathbb{C}$ to $\mathcal{H}^0 = \mathbb{C}$. Any such linear function is simply a
$1 \times 1$ matrix, i.e. a complex number, which we call $Z([\mathrm{torus}])$.

Now there is a nice parametrisation of conformal equivalence classes of tori, as we will
see more explicitly in §2.3. Namely, a representative for each class can be chosen to be
of the form $\mathbb{C}/(\mathbb{Z} + \mathbb{Z}\tau)$ where $\tau$ is in the upper half plane $\mathbb{H}$. Thus we can write $Z$ as a
function of a complex variable $\tau$. However, different $\tau$ correspond to the same equivalence
class of tori: the redundancy is exactly captured by the modular group $\mathrm{PSL}_2(\mathbb{Z})$. Namely,
$\tau$ and $\frac{a\tau+b}{c\tau+d}$ are equivalent, whenever $a, b, c, d \in \mathbb{Z}$ and $ad - bc = 1$. Thus
$Z(\tau) = Z\left(\frac{a\tau+b}{c\tau+d}\right)$.
In other words, the partition function is modular invariant!$^1$
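
The redundancy in $\tau$ can be seen numerically. The sketch below moves a point of the upper half plane into the usual fundamental domain ($|\mathrm{Re}\,\tau| \le 1/2$, $|\tau| \ge 1$) and checks that $\tau$ and its image under one particular integer matrix of determinant 1 land on the same representative. It relies on the standard fact, not stated in the text, that the modular group is generated by $\tau \mapsto \tau + 1$ and $\tau \mapsto -1/\tau$; the function names, the sample point and the tolerances are assumptions of this sketch.

```python
def mobius(a, b, c, d, tau):
    """Act on tau by the integer matrix [[a, b], [c, d]] with ad - bc = 1."""
    assert a * d - b * c == 1
    return (a * tau + b) / (c * tau + d)

def reduce_to_fundamental_domain(tau, max_steps=1000):
    """Move tau (Im tau > 0) into |Re tau| <= 1/2, |tau| >= 1."""
    for _ in range(max_steps):
        # Translate by an integer: tau -> tau + n.
        tau = complex(tau.real - round(tau.real), tau.imag)
        if abs(tau) < 1 - 1e-12:
            tau = -1 / tau            # inversion: tau -> -1/tau
        else:
            return tau
    raise RuntimeError("reduction did not terminate")

tau = complex(0.3, 0.8)
tau2 = mobius(2, 1, 5, 3, tau)        # 2*3 - 1*5 = 1

r1 = reduce_to_fundamental_domain(tau)
r2 = reduce_to_fundamental_domain(tau2)
print(r1, r2)
print(abs(r1 - r2) < 1e-9)            # same torus, same representative
```

A modular invariant function such as $Z$ is therefore completely determined by its values on this fundamental domain.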

There are two sectors in CFT, a holomorphic one and an antiholomorphic one, corresponding
to the two directions ('left-' and 'right-moving') of motion on a string, or the
two components of the group $\mathrm{Diff}(S^1)$ of diffeomorphisms of the circle. This means that
many of the quantities (e.g. the partition function) factorise into parts depending
holomorphically and anti-holomorphically on the modular parameters (e.g. $\tau$ in genus 1). In a
rational CFT there are finitely many 'primary fields' $a \in \Phi$. The precise meaning of this
is not important here, but it says that the space of states for the theory decomposes into
a finite sum$^2$ $\mathcal{H} = \oplus_{a,b \in \Phi} M_{ab}\, H_a \otimes H_b$, where $M_{ab}$ are nonnegative integers which count
the multiplicity of $H_a \otimes H_b$ in $\mathcal{H}$. The linear maps $\mathcal{T}(\Sigma) : \mathcal{H}^m \to \mathcal{H}^n$ in an RCFT will
factorise similarly; this 'chiral factorisation' is captured by what Segal calls the 'modular
functor' [52]. The partition function becomes

$$Z(\tau) = \sum_{a,b \in \Phi} M_{ab}\, \chi_a(\tau)\, \chi_b(\tau)^* \qquad (1.1.1)$$

for certain holomorphic functions $\chi_a$. One of the primary fields (we'll denote it '0')
corresponds to the vacuum $\Omega$, and uniqueness of the vacuum means that $M_{00} = 1$.

$H_0$ is called a chiral algebra; in the language of §2.6, $H_0$ will be a vertex operator
algebra (VOA). $\Phi$ parametrises the irreducible $H_0$-modules and the $\chi$'s are their characters;
in an RCFT we require this number to be finite. For example, for the Moonshine VOA $V^\natural$
discussed in Part 2, $\Phi$ consists of only one element.

The higher-genus behaviour of an RCFT is determined from the lower-genus behaviour,
by composition of 'arrows' (i.e. the gluing together of surfaces) in $\mathcal{C}$. See Figure
3 of [30] to find how a genus-2 surface is built up from genus-0 ones. In fact, it's generally
believed that an RCFT will be uniquely determined by: (i) the choice of chiral algebra; (ii)
the partition function (which tells you the spectrum of the theory, i.e. how the two sectors
link up); and (iii) the structure constants $C_{ab}^{c}$, which in the Segal formalism correspond to
the surfaces called 'pairs-of-pants', equivalently disks with two interior disks removed. Our
approach will be to start with a chiral algebra, and find all possible partition functions.
We will thus ignore the important question of existence and uniqueness of the structure
constants, though at least for our chiral algebras, it seems to be generally believed that
the structure constants will be unique.

Footnote 1. In higher-dimensional string theories, a similar argument shows more generally that
automorphic forms will appear naturally.

Footnote 2. It seems though that 'rational' logarithmic CFT is trying to teach us the lesson that this
familiar requirement can and should be weakened. See Gaberdiel-Kausch (1999).

Perhaps all chiral algebras come from standard constructions (e.g. orbifolds and the
Goddard-Kent-Olive (GKO) coset construction; see e.g. [18]) involving lattices and affine
Kac-Moody algebras. For instance a $\mathbb{Z}_2$-orbifold of the VOA of the Leech lattice gives us the
Moonshine module $V^\natural$, and the so-called minimal models come from GKO cosets involving
$A_1^{(1)}$. This is in line with the spirit of Tannaka-Krein duality (and its generalisations by
Deligne and Doplicher-Roberts), which roughly says that if a bunch of things act like
they're the set of representations of a Lie group, then they are the set of representations
of a Lie group.

In any case, one of the simplest, best understood, and most important classes of RCFTs (called
Wess-Zumino-Witten (WZW) models; see for instance [30,59] in this volume)
corresponds to affine Kac-Moody algebras at a positive integer level $k$. We will have much
more to say later about these algebras, but for now let us remark that $\Phi$ here will be
the (finite) set $P_+^k$ of integrable level $k$ highest weights $\lambda$. Their chiral algebras were
constructed by Frenkel and Zhu. The following sections concern the attempt to classify the
partition functions corresponding to Kac-Moody algebras; see especially §1.5. I will use
this theme as an excuse to describe many other things, e.g. the A-D-E meta-pattern, Lie
theory, Galois, fusion rings, ... I dedicate these notes to the profound friendship developing
in recent years between mathematics and physics. As Victor Kac said in his 1996 Wigner
medal acceptance speech, "Some of the best ideas come to my field from the physicists.
And on top of this they award me a medal. One couldn't hope for a better deal."



1.2. Lie algebras 

Lie algebras (and their nonlinear partners Lie groups) appear in numerous places
throughout math and mathematical physics. A nice introduction is [9]; Lie theory is
presented with more of a physics flavour in [24], as well as [59].

An algebra is a vector space with a way to multiply vectors which is compatible with
the vector space structure (i.e. the vector-valued product is bilinear:
$(au + a'u')(bv + b'v') = ab\,uv + ab'\,uv' + a'b\,u'v + a'b'\,u'v'$). For example, the complex numbers $\mathbb{C}$ can be thought
of as a 2-dimensional algebra over $\mathbb{R}$ (a basis is 1 and $i = \sqrt{-1}$; the scalars here are real
numbers and the vectors are complex numbers). The quaternions are 4-dimensional over
$\mathbb{R}$ and the octonions are 8-dimensional over $\mathbb{R}$. Incidentally, together with $\mathbb{R}$ itself these are
essentially the only finite-dimensional algebras over $\mathbb{R}$ which obey the cancellation law: $u \ne 0$ and
$uv = 0$ implies $v = 0$ (the reader should try to convince himself why the familiar vector product on
$\mathbb{R}^3$ fails the cancellation law). This important little fact makes several unexpected appearances in
math. For instance, it is trivially possible to 'comb the hair' on the circle $S^1$ without
'cheating' (i.e. needing a hair-part or exploiting a bald spot): just comb the hair clockwise
for example. However it is not possible to comb the hair on the sphere $S^2$ (e.g. your
own head) without cheating. The only other $k$-spheres $S^k$ which can be combed (i.e. for
which there exist $k$ linearly independent continuous vector fields) are $k = 3$ and 7. This
is intimately connected with the existence of $\mathbb{C}$, the quaternions, and octonions (namely,
$S^1$, $S^3$, $S^7$ can be thought of as the length 1 complex numbers, quaternions, and octonions,
resp.).

In a Lie algebra $\mathfrak{g}$, the product is usually called a 'bracket' and is written $[xy]$. It is
required to be 'anti-commutative' and 'anti-associative':

$$[xy] + [yx] = 0 \qquad (1.2.1a)$$

$$[x[yz]] + [y[zx]] + [z[xy]] = 0 \qquad (1.2.1b)$$

(like most other equalities in math, (1.2.1b) is usually called the Jacobi identity). Usually
we will consider Lie algebras over $\mathbb{C}$, but sometimes over $\mathbb{R}$. Note that (1.2.1a) says
$[xx] = 0$.

One important consequence of bilinearity (together with anti-commutativity) is that it is enough
to know the values of all the brackets $[x_i x_j]$ for $i < j$, for any basis $\{x_1, x_2, \ldots\}$ of $\mathfrak{g}$.
(The reader should convince himself of this before proceeding.)
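
Here is a small Python sketch of this remark: a bracket is stored only through the values $[x_i x_j]$ for $i < j$, extended to all vectors by anti-commutativity and bilinearity, and then tested against the Jacobi identity (1.2.1b) on basis vectors. The function names and the two sample tables of basis brackets are my own choices for illustration.

```python
from fractions import Fraction
from itertools import product

def make_bracket(dim, basis_brackets):
    """Extend brackets [x_i x_j], given only for i < j, to all vectors.

    basis_brackets maps (i, j) with i < j to a coefficient list of length dim;
    pairs that are missing are taken to bracket to 0.
    """
    def basis(i, j):
        if i == j:
            return [0] * dim                        # [xx] = 0
        if i < j:
            return basis_brackets.get((i, j), [0] * dim)
        return [-c for c in basis(j, i)]            # anti-commutativity

    def bracket(u, v):
        out = [Fraction(0)] * dim
        for i, j in product(range(dim), repeat=2):  # bilinearity
            for k, c in enumerate(basis(i, j)):
                out[k] += u[i] * v[j] * c
        return out
    return bracket

def jacobi_holds(bracket, dim):
    """Check (1.2.1b) on basis vectors; by trilinearity that is enough."""
    e = [[Fraction(int(i == k)) for k in range(dim)] for i in range(dim)]
    for x, y, z in product(e, repeat=3):
        terms = [bracket(x, bracket(y, z)),
                 bracket(y, bracket(z, x)),
                 bracket(z, bracket(x, y))]
        if any(sum(t[k] for t in terms) != 0 for k in range(dim)):
            return False
    return True

# The cross product: [x1 x2] = x3, [x1 x3] = -x2, [x2 x3] = x1.
cross = make_bracket(3, {(0, 1): [0, 0, 1], (0, 2): [0, -1, 0], (1, 2): [1, 0, 0]})
# Bilinear and anti-commutative, but not a Lie bracket: [x1 x2] = x2, [x2 x3] = x1.
bad = make_bracket(3, {(0, 1): [0, 1, 0], (1, 2): [1, 0, 0]})

print(jacobi_holds(cross, 3), jacobi_holds(bad, 3))   # True False
```

The second table shows that the Jacobi identity is a genuine constraint: anti-commutativity and bilinearity alone do not make a Lie algebra.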

The simplest example of a Lie algebra is $\mathfrak{g} = \mathbb{C}$ (or $\mathfrak{g} = \mathbb{R}$), with the bracket $[xy]$
identically 0. In fact, this is the only 1-dimensional Lie algebra. It is a straightforward exercise
for the reader to find all 2- and 3-dimensional Lie algebras (over $\mathbb{C}$) up to isomorphism
(i.e. change of basis): there are precisely 2 and 6 of them, respectively (though one of the
6 depends on a complex parameter). Over $\mathbb{R}$, there are 2 and 9 (with 2 depending on real
parameters). This exercise cannot be continued much further: e.g. not all 7-dimensional
Lie algebras (over $\mathbb{C}$) are known. Nor is it obvious that this would be an interesting or
valuable exercise. We should suspect that our definition of Lie algebra is probably a little
too general for anything obeying it to be automatically an interesting structure. More
often than not, a classification turns out to be a stale and useless list.

Two of the 3-dimensional Lie algebras are important in what follows. One of them is
well-known to the reader: consider the vector-product (also called cross-product) in $\mathbb{C}^3$.
Taking the standard basis $\{e_1, e_2, e_3\}$ of $\mathbb{C}^3$, the bracket can be defined by the relations

$$[e_1 e_2] = e_3, \quad [e_1 e_3] = -e_2, \quad [e_2 e_3] = e_1. \qquad (1.2.2a)$$

This Lie algebra, denoted $A_1$ or $\mathfrak{sl}_2(\mathbb{C})$, can be called the 'mother of all (semi-simple) Lie
algebras'. A more familiar realisation of $A_1$ uses a basis $\{e, f, h\}$ with relations

$$[ef] = h, \quad [he] = 2e, \quad [hf] = -2f. \qquad (1.2.2b)$$

The reader can find the change-of-basis (valid over $\mathbb{C}$ but not $\mathbb{R}$) showing that (1.2.2) define
isomorphic complex (but not real) Lie algebras.
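
For readers who want to check their answer, the sketch below tests one possible change of basis numerically, using the cross product on $\mathbb{C}^3$ as the bracket of (1.2.2a). The particular choice $e = -i e_1 + e_2$, $f = -i e_1 - e_2$, $h = 2i e_3$ is only one of many valid ones, and the helper names are assumptions of this sketch.

```python
def cross(u, v):
    """Cross product on C^3: [e1 e2] = e3, [e1 e3] = -e2, [e2 e3] = e1."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def scale(c, u):
    return tuple(c * x for x in u)

# Candidate basis, written in coordinates with respect to e1, e2, e3.
e = (-1j, 1, 0)     # e = -i e1 + e2
f = (-1j, -1, 0)    # f = -i e1 - e2
h = (0, 0, 2j)      # h = 2i e3

print(cross(e, f) == h)               # [ef] = h
print(cross(h, e) == scale(2, e))     # [he] = 2e
print(cross(h, f) == scale(-2, f))    # [hf] = -2f
```

The coefficients are genuinely complex, which is in line with the remark that this change of basis is not available over the reals.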

Another important 3-dimensional Lie algebra is called the Heisenberg algebra and
is the algebra of the canonical commutation relations in quantum mechanics: choosing a
basis $x, p, h$, it is defined by

$$[xp] = h, \quad [xh] = [ph] = 0. \qquad (1.2.3)$$



From our definition, it is far from clear that Lie algebras, as a class, should be natural 
and worth studying. After all, there are infinitely many possible axiomatic systems: why 
should the one defining a Lie algebra be anything special a priori? Perhaps this could have
been anticipated by the following line of reasoning. 

Axiom. Groups are important and interesting. 
Axiom. Manifolds are important and interesting. 

Manifolds are structures where calculus is possible; locally, a manifold looks like a
piece of $\mathbb{R}^n$ (or $\mathbb{C}^n$), but these pieces can be bent and stitched together to create more
interesting shapes. For instance a circle is a 1-dimensional manifold, while Einstein claimed
space-time is a curved 4-dimensional one.

Definition. A Lie group is a manifold with a compatible group structure. 

This means that 'multiplication' and 'inverse' are differentiable maps. $\mathbb{R}$ is a Lie
group, under addition: obviously, $\mu : \mathbb{R}^2 \to \mathbb{R}$ and $\iota : \mathbb{R} \to \mathbb{R}$ defined by $\mu(a, b) = a + b$
and $\iota(a) = -a$ are both differentiable. (Why isn't $\mathbb{R}$ a Lie group under multiplication?) A
circle is also a Lie group: parametrise the points with the angle $\theta$ defined mod $2\pi$ (or mod
360 if you prefer); the 'product' of the point at angle $\theta_1$ with the point at angle $\theta_2$ will
be the point at angle $\theta_1 + \theta_2$. Surprisingly, the only other $k$-sphere which is a Lie group
is $S^3$ (the product can be defined using quaternions of unit length$^4$, or using the matrix
group $\mathrm{SU}_2(\mathbb{C})$). Many but not all Lie groups can be expressed as matrix groups. Two
other examples are $\mathrm{GL}_n$ (invertible $n \times n$ matrices) and $\mathrm{SL}_n$ (ones with determinant 1).

A consequence of the above axioms is then surely: 

Corollary. Lie groups should be important and interesting. 

Lie group structure theory can be thought of as a major generalisation of linear alge- 
bra. The basic constructions familiar to undergraduates have important analogues valid 
in many Lie groups. For instance, years ago we were taught to solve linear equations 
and invert matrices by using elementary row operations to reduce a matrix to row-echelon 
form. What this says is that any matrix $A \in \mathrm{GL}_n(\mathbb{C})$ can be factorised $A = BPN$, where
$N$ is upper-triangular with 1's on the diagonal, $P$ is a permutation matrix, and $B$ is an
upper-triangular matrix. This is essentially what is called the Bruhat decomposition of the
Lie group $\mathrm{GL}_n(\mathbb{C})$. More generally (where it applies to any 'reductive' Lie group $G$), $P$
will be an element of the so-called 'Weyl group' of G (of which we'll have much more to 
say later), and B will be in a 'Borel subgroup'. 

Lie groups appear throughout physics. E.g. the orthogonal group $\mathrm{SO}_3(\mathbb{R})$ is the
configuration space of a rigid body centred at the origin, while $\mathrm{SU}_2(\mathbb{C})$ is the set of states
of an electron at rest. The gauge group of the Standard Model of particle physics is
$\mathrm{SU}_3(\mathbb{C}) \times \mathrm{SU}_2(\mathbb{C}) \times \mathrm{U}_1(\mathbb{C})$, while the Lorentz group of special relativity is $\mathrm{SO}_{3,1}(\mathbb{R})$.

There is an important relation between Lie groups and Lie algebras. 

Fact. The tangent space of a Lie group is a Lie algebra. Any (finite-dimensional real or 
complex) Lie algebra is the tangent space to some Lie group. 



Footnote 4. Similarly, the 7-sphere inherits from the octonions a nonassociative (hence nongroup)
product, compatible with its manifold structure.



More precisely, the tangent space at 1 (i.e. the set of all infinitesimal generators of the 
Lie group) can be given a natural Lie algebra structure. A Lie algebra, being a linearised Lie 
group, is much simpler and easier to handle. The Lie algebra preserves the local properties 
of the Lie group, though it loses global topological properties (like boundedness). A Lie
group has a single Lie algebra, but a Lie algebra can correspond to many different Lie
groups. The Lie algebra corresponding to both $\mathbb{R}$ and $S^1$ is $\mathfrak{g} = \mathbb{R}$ with trivial bracket.
The Lie algebra corresponding to both $S^3 = \mathrm{SU}_2(\mathbb{C})$ and $\mathrm{SO}_3(\mathbb{R})$ is the cross-product
algebra on $\mathbb{R}^3$ (usually called $\mathfrak{so}_3(\mathbb{R})$). Given the above fact, a safe guess would be:

Conjecture. Lie algebras are important and interesting. 

From this line of reasoning, it should be expected that historically Lie groups arose 
first. Indeed that is the case: the Norwegian Sophus Lie introduced them in 1873 to try to 
develop a Galois theory for ordinary differential equations. As the reader may be aware, 
Galois theory is used for instance to show that not all 5th degree (or higher) polynomials 
can be explicitly 'solved' using radicals — we will meet Galois theory in §1.8. Lie wanted 
to study the explicit solvability (integrability) of differential equations, and this led him to 
develop what we now call Lie theory. The importance of Lie groups however has grown
well beyond this initial motivation.

An important class of Lie algebras are the so-called finite-dimensional simple ones. 
Their definition and motivation will be studied in §2.7 below, but in a certain sense they 
serve as building blocks for all other finite-dimensional Lie algebras. 

The classification of simple finite-dimensional Lie algebras over $\mathbb{C}$ is quite important
and was accomplished at the turn of the 20th century by Killing and Cartan. There are 4
infinite families $A_\ell$ ($\ell \ge 1$), $B_\ell$ ($\ell \ge 3$), $C_\ell$ ($\ell \ge 2$), and $D_\ell$ ($\ell \ge 4$), and 5 exceptionals $E_6$,
$E_7$, $E_8$, $F_4$ and $G_2$. $A_\ell$ can be thought of$^5$ as $\mathfrak{sl}_{\ell+1}(\mathbb{C})$, the $(\ell+1) \times (\ell+1)$ matrices with
trace 0. The orthogonal algebras $B_\ell$ and $D_\ell$ can be identified with $\mathfrak{so}_{2\ell+1}(\mathbb{C})$ and $\mathfrak{so}_{2\ell}(\mathbb{C})$,
resp., where $\mathfrak{so}_n(\mathbb{C})$ is all $n \times n$ anti-symmetric matrices $A^t = -A$. The symplectic algebra
$C_\ell$ is $\mathfrak{sp}_{2\ell}(\mathbb{C})$, i.e. all $2\ell \times 2\ell$ matrices $A$ obeying $A\Omega = -\Omega A^t$, where

$$\Omega = \begin{pmatrix} 0 & I_\ell \\ -I_\ell & 0 \end{pmatrix}$$

and $I_\ell$ is the identity matrix. The exceptionals can be constructed e.g. using the octonions. In
all these cases the bracket is given by the commutator

$$[AB] = [A, B] := AB - BA \qquad (1.2.4)$$

(it is a good exercise for the reader to confirm that the commutator satisfies (1.2.1), and
that e.g. $\mathfrak{sl}_n(\mathbb{C})$ is indeed closed under it). To see that (1.2.2b) truly is $\mathfrak{sl}_2(\mathbb{C})$, put

$$e = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad f = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \quad h = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$

Footnote 5. Incidentally the names A, B, C, D have no significance: since the 4 series start at
$\ell = 1, 2, 3, 4$, it seemed natural to call these A, B, C, D, resp. Unfortunately a bit of bad
luck happened: $B_2$ and $C_2$ are isomorphic and so at random that algebra was placed in
the orthogonal series; however affine Dynkin diagrams make it clear that it really is a
symplectic algebra which accidentally looks orthogonal; hence in hindsight the names of
the B- and C-series really should have been switched.
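
The exercises about the commutator can be spot-checked by machine. The sketch below verifies the relations (1.2.2b) for the standard $2 \times 2$ matrix units $e$, $f$ and the diagonal matrix $h = \mathrm{diag}(1, -1)$, and then tests the Jacobi identity and the trace-0 property of commutators on three sample integer matrices. The sample matrices and helper names are assumptions, and a check on samples is of course not a proof.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def comm(A, B):
    """Commutator bracket (1.2.4): [A, B] = AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    n = len(A)
    return [[AB[i][j] - BA[i][j] for j in range(n)] for i in range(n)]

def scale(c, A):
    return [[c * a for a in row] for row in A]

def add(*matrices):
    """Entrywise sum of any number of equally sized matrices."""
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*matrices)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

e = [[0, 1], [0, 0]]
f = [[0, 0], [1, 0]]
h = [[1, 0], [0, -1]]

relations_ok = (comm(e, f) == h
                and comm(h, e) == scale(2, e)
                and comm(h, f) == scale(-2, f))

# Jacobi identity (1.2.1b) and tr[X, Y] = 0 on arbitrary sample matrices.
X, Y, Z = [[1, 2], [3, 4]], [[0, -1], [5, 2]], [[7, 1], [-2, 3]]
jacobi = add(comm(X, comm(Y, Z)), comm(Y, comm(Z, X)), comm(Z, comm(X, Y)))
jacobi_ok = jacobi == [[0, 0], [0, 0]]
trace_ok = trace(comm(X, Y)) == 0

print(relations_ok, jacobi_ok, trace_ok)   # True True True
```

Since the trace of any commutator vanishes, the bracket of two trace-0 matrices is again trace-0, which is the closure statement for $\mathfrak{sl}_n(\mathbb{C})$.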

This classification changes if the field (the choice of scalars, i.e. numbers) is changed.
By a field, we mean we can add, subtract, multiply and divide, such that all the usual
properties like commutativity and distributivity are obeyed. Fields will make a few different
appearances in these notes. $\mathbb{C}$, $\mathbb{R}$, and $\mathbb{Q}$ are fields, while $\mathbb{Z}$ is not (you can't always
divide an integer by e.g. 3, and remain in $\mathbb{Z}$). The integers mod $n$, which we will write $\mathbb{Z}_n$,
are a field iff $n$ is prime (the reader can verify that in e.g. $\mathbb{Z}_4$, it is not possible to divide
by the element $[2] \in \mathbb{Z}_4$ even though $[2] \ne [0]$ there). $\mathbb{C}$ and $\mathbb{R}$ are examples of fields
of characteristic 0: this means that 0 is the only integer $k$ with the property that $kx = 0$
for all $x$ in the field. $\mathbb{Z}_p$ is the simplest example of a field with nonzero characteristic: in
$\mathbb{Z}_p$, multiplying by the integer $p$ has the same effect as multiplying by 0, and so we say $\mathbb{Z}_p$
has characteristic $p$. Strange fields have important applications in e.g. coding theory and,
ironically, in number theory itself (see e.g. §1.8).
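
The statement about the integers mod $n$ is easy to test by brute force. The following sketch lists the residues which have a multiplicative inverse and compares 'every nonzero residue is invertible' with primality for small $n$; the function names and the range of $n$ are my own choices.

```python
def invertible_elements(n):
    """Nonzero residues a mod n which have a multiplicative inverse mod n."""
    return [a for a in range(1, n)
            if any((a * b) % n == 1 for b in range(1, n))]

def is_field(n):
    """Z_n is a field exactly when every nonzero residue can be divided by."""
    return n > 1 and len(invertible_elements(n)) == n - 1

def is_prime(n):
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))

print(invertible_elements(4))   # [1, 3]: the residue [2] has no inverse in Z_4
print(all(is_field(n) == is_prime(n) for n in range(2, 60)))   # True
```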

As always, $\mathbb{C}$ is better behaved than e.g. $\mathbb{R}$ because every polynomial can be factorised
into linear factors over $\mathbb{C}$ (we say $\mathbb{C}$ is algebraically closed); this implies for example that every
matrix has an eigenvector over $\mathbb{C}$ but not necessarily over $\mathbb{R}$. Over $\mathbb{R}$, the difference in the simple
Lie algebra classification is that each symbol $X_\ell \in \{A_\ell, \ldots, G_2\}$ corresponds to a number
of inequivalent algebras (over $\mathbb{C}$, each algebra has its own symbol). For example, '$A_1$'
corresponds to 3 different real simple Lie algebras, namely the matrix algebras $\mathfrak{sl}_2(\mathbb{R})$,
$\mathfrak{sl}_2(\mathbb{C})$ (interpreted as a real vector space), and $\mathfrak{so}_3(\mathbb{R})$. The simple Lie algebra classification
has recently been done in any characteristic $p > 7$. It is surprising but very common that
the smaller primes behave very poorly, and the classification for characteristic 2 is probably
completely hopeless.

Associated with each simple algebra $X_\ell$ is a Weyl group, and a (Coxeter-)Dynkin
diagram. The Weyl group is a finite reflection group, e.g. for $A_\ell$ it is the symmetric group
$\mathfrak{S}_{\ell+1}$. See Figure 7 of [59] for the Weyl group of $A_2$. The Dynkin diagram of $X_\ell$ (see e.g.
[24,36,38] or Figure 6 in [59]) is a graph with $\ell$ nodes, and with possibly some double and
triple edges. It says how to construct $X_\ell$ abstractly using generators and relations; see
§2.7. We will keep meeting both throughout these notes.

Another source of Lie algebras are the vector fields $\mathrm{Vect}(M)$ on a manifold $M$. A
vector field $v$ is a choice (in a smooth way) of a tangent vector $v(p) \in T_pM$ at each point
of $M$. It can be thought of as a (1st order) differential operator, acting on functions
$f : M \to \mathbb{R}$ (or $f : M \to \mathbb{C}$); at each point $p \in M$ take the directional derivative of $f$ in
the direction $v(p)$. For example the vector fields on the circle, $\mathrm{Vect}(S^1)$, can be thought
of as anything of the form $g(\theta)\frac{d}{d\theta}$ where $g(\theta)$ can be any function with period $2\pi$. We can
compose vector fields $u \circ v$, but this will result in a 2nd order differential operator: e.g.

$$f(\theta)\frac{d}{d\theta} \circ g(\theta)\frac{d}{d\theta} = f(\theta)g'(\theta)\frac{d}{d\theta} + f(\theta)g(\theta)\frac{d^2}{d\theta^2}.$$

Instead, the natural 'product' of vector fields is given by their commutator $[u, v] = u \circ v - v \circ u$,
as it always results in a vector field: e.g.

$$\left[f(\theta)\frac{d}{d\theta},\, g(\theta)\frac{d}{d\theta}\right] = \left(f(\theta)g'(\theta) - f'(\theta)g(\theta)\right)\frac{d}{d\theta}$$

in $\mathrm{Vect}(S^1)$. $\mathrm{Vect}(M)$ with this bracket is an infinite-dimensional Lie algebra. In the
case where $M$ is a Lie group $G$, the Lie algebra of $G$ can be interpreted as a certain
finite-dimensional subalgebra of $\mathrm{Vect}(G)$ given by the 'left-invariant vector fields'.

Simple algebras need not be finite-dimensional. An example of an infinite-dimensional
one is the Witt algebra $\mathfrak{W}$, which can be defined (over $\mathbb{C}$) by the basis$^6$ $L_n$, $n \in \mathbb{Z}$, and the
relations

$$[L_m L_n] = (m - n) L_{m+n}. \qquad (1.2.6)$$

Using the realisation $L_n = -i e^{-in\theta} \frac{d}{d\theta}$, the Witt algebra can also be interpreted as the
polynomial subalgebra of the complexification $\mathbb{C} \otimes \mathrm{Vect}(S^1)$, i.e. change the scalar field
of $\mathrm{Vect}(S^1)$ from $\mathbb{R}$ to $\mathbb{C}$. Incidentally, infinite-dimensional Lie algebras need not have a
corresponding Lie group: e.g. the real algebra $\mathrm{Vect}(S^1)$ is the Lie algebra of the Lie group
$\mathrm{Diff}^+(S^1)$ of orientation-preserving diffeomorphisms $S^1 \to S^1$, but $\mathbb{C} \otimes \mathrm{Vect}(S^1)$ has no Lie
group. $\mathrm{Diff}^+(S^1)$ plays a large role in CFT, by acting on the objects of Segal's category
$\mathcal{C}$.

The Witt algebra appears naturally in CFT: e.g. using the realisation $L_n = -z^{n+1}\frac{d}{dz}$ it
is the polynomial subalgebra of the Lie algebra $\mathrm{Vect}(\mathbb{C} \setminus \{0\})$. Very carelessly, $\mathrm{Vect}(\mathbb{C} \setminus \{0\})$
is often thought of as the infinitesimal conformal transformations on a suitable neighbourhood
of 0 (yet clearly $L_{-2}, L_{-3}, \ldots$ are singular at 0!). Indeed the CFT literature is very
sloppy when discussing the conformal group in 2 dimensions. The unfortunate fact is that,
contrary to claims, there is no infinite-dimensional conformal group for $\mathbb{C} = \mathbb{R}^2$. The best
we can do is the 3-dimensional group $\mathrm{PSL}_2(\mathbb{C})$ of Möbius transformations $z \mapsto \frac{az+b}{cz+d}$, which
are orientation-preserving conformal transformations for the Riemann sphere $\mathbb{C} \cup \{\infty\}$.
There seem to be 2 ways out of this rather embarrassing predicament. One is to argue
that we are really interested in 'infinitesimal conformal invariance' in some meromorphic
sense, so the full Witt algebra can appear. The other way is to argue that it is the conformal
group of 'Minkowski space' $\mathbb{R}^{1,1}$ (or better, its compactification $S^1 \times S^1$) rather than
$\mathbb{R}^2 = \mathbb{C}$ (or its compactification $S^2$) which is relevant for CFT. That conformal group is
infinite-dimensional; for $S^1 \times S^1$ it consists of 2 copies of $\mathrm{Diff}^+(S^1) \times \mathrm{Diff}^+(S^1)$. For a
more careful treatment of this point, see [51].
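
The relations (1.2.6) can be verified for the realisation $L_n = -z^{n+1}\frac{d}{dz}$ by letting the operators act on Laurent polynomials, using $L_n z^k = -k z^{n+k}$. The sketch below does this in plain Python; the encoding of a Laurent polynomial as a dictionary from exponents to coefficients, the helper names and the test polynomial are assumptions made for the illustration.

```python
def L(n, poly):
    """Apply L_n = -z^(n+1) d/dz to a Laurent polynomial {exponent: coefficient}."""
    out = {}
    for k, c in poly.items():
        if k * c != 0:                        # d/dz kills constants
            out[k + n] = out.get(k + n, 0) - k * c
    return out

def sub(p, q):
    """Difference p - q, dropping zero coefficients."""
    out = dict(p)
    for k, c in q.items():
        out[k] = out.get(k, 0) - c
    return {k: c for k, c in out.items() if c != 0}

def scale(a, p):
    return {k: a * c for k, c in p.items() if a * c != 0}

def witt_relation_holds(m, n, poly):
    lhs = sub(L(m, L(n, poly)), L(n, L(m, poly)))   # [L_m, L_n] applied to poly
    rhs = scale(m - n, L(m + n, poly))              # (m - n) L_{m+n} applied to poly
    return lhs == rhs

test_poly = {-3: 2, -1: 5, 0: 7, 2: -1, 4: 3}
ok = all(witt_relation_holds(m, n, test_poly)
         for m in range(-4, 5) for n in range(-4, 5))
print(ok)   # True
```

The negative exponents appearing for $n \le -2$ are the singularities at 0 mentioned in the text.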

For reasons we will discuss in §1.4, we are more interested in the Virasoro algebra $\mathfrak{V}$
rather than the Witt algebra $\mathfrak{W}$. This is a '1-dimensional central extension' of $\mathfrak{W}$; as a
vector space $\mathfrak{V} = \mathfrak{W} \oplus \mathbb{C}C$ with relations given by

$$[L_m L_n] = (m - n) L_{m+n} + \delta_{n,-m} \frac{m(m^2 - 1)}{12} C \qquad (1.2.7a)$$

$$[L_m C] = 0. \qquad (1.2.7b)$$

'1-dimensional central extension' means $\mathfrak{V}$ has one extra basis vector $C$, which lies in the
centre of $\mathfrak{V}$ (i.e. $[xC] = 0$ for all $x \in \mathfrak{V}$), and sending $C \mapsto 0$ recovers $\mathfrak{W}$ (i.e. takes (1.2.7a)
to (1.2.6)). A common mistake in the physics literature is to regard $C$ as a number: it
is in fact a vector, though in many (but not all) representations it is mapped to a scalar
multiple of the identity.
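
That the central term in (1.2.7a) is compatible with the Jacobi identity can be checked directly. Expanding $[L_l[L_m L_n]]$ with (1.2.7a) and (1.2.7b), the coefficient of $C$ is $(m - n)$ times the central coefficient of $[L_l L_{m+n}]$, and similarly for the cyclic permutations; the sketch below confirms that these three contributions cancel for a range of indices. The function names and the range are my own choices.

```python
from fractions import Fraction

def central_term(m, n):
    """Coefficient of C in [L_m L_n]: delta_{n,-m} * m(m^2 - 1)/12."""
    return Fraction(m * (m * m - 1), 12) if m + n == 0 else Fraction(0)

def jacobi_central_part(l, m, n):
    """Coefficient of C in [L_l[L_m L_n]] + [L_m[L_n L_l]] + [L_n[L_l L_m]]."""
    return ((m - n) * central_term(l, m + n)
            + (n - l) * central_term(m, n + l)
            + (l - m) * central_term(n, l + m))

R = range(-6, 7)
ok = all(jacobi_central_part(l, m, n) == 0 for l in R for m in R for n in R)
antisymmetric = all(central_term(m, n) == -central_term(n, m) for m in R for n in R)
print(ok, antisymmetric)   # True True
```

The part of the Jacobi identity not involving $C$ is just the Jacobi identity of the Witt algebra, so nothing further needs checking.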



Footnote 6. In infinite dimensions, to avoid convergence complications, only finite linear combinations
of basis vectors are generally permitted. Infinite linear combinations would involve taking some
'completion'.



The reason for the strange-looking (1.2.7a) is that we have little choice: $\mathfrak{V}$ is the unique
nontrivial 1-dimensional central extension of $\mathfrak{W}$. The factor $\frac{1}{12}$ there is conventional but
standard, and has to do with 'zeta-function regularisation' in string theory, i.e. the
divergent sum $\sum_{n=1}^\infty n$ is 'reinterpreted' as $\zeta(-1) = -\frac{1}{12}$, where $\zeta(s) = \sum_{n=1}^\infty n^{-s}$ is the
Riemann zeta function. Incidentally, for $\mathrm{Re}(s) > 1$, $\zeta(s)$ can be written as the product $\prod_p (1 - p^{-s})^{-1}$
over all primes $p = 2, 3, 5, \ldots$ (try to see why); hence $\zeta(s)$ has a lot to do with primes, in
particular their distribution. In fact the most famous unsolved problem in math today is
the Riemann conjecture, which says that $\zeta(s) \ne 0$ whenever $\mathrm{Re}(s) > \frac{1}{2}$. One researcher
recently described this conjecture as saying that the primes have music in them.
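
The product formula can be tested numerically at a point where both sides converge. The sketch below compares a truncated sum and a truncated product over primes at $s = 2$ with the known value $\zeta(2) = \pi^2/6$. The truncation bounds and function names are arbitrary choices of mine, so the three numbers agree only to a few decimal places.

```python
from math import pi

def primes_up_to(n):
    """Sieve of Eratosthenes."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            for multiple in range(p * p, n + 1, p):
                sieve[multiple] = False
    return [p for p, is_p in enumerate(sieve) if is_p]

def zeta_sum(s, terms):
    """Truncated Dirichlet series: sum of n^(-s) for n = 1..terms."""
    return sum(n ** -s for n in range(1, terms + 1))

def zeta_euler(s, prime_bound):
    """Truncated Euler product: product of (1 - p^(-s))^(-1) over primes p <= prime_bound."""
    result = 1.0
    for p in primes_up_to(prime_bound):
        result /= 1 - p ** -s
    return result

s = 2
series_value = zeta_sum(s, 100000)
product_value = zeta_euler(s, 100000)
print(series_value, product_value, pi ** 2 / 6)
```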

In CFT, $L_0$ is the energy operator. For example the partition function is given by
$Z(\tau) = \mathrm{Tr}_{\mathcal{H}}\left(q^{L_0 - c/24}\, (q^*)^{\overline{L}_0 - c/24}\right)$ and the (normalised) character $\chi_a$ equals
$\mathrm{Tr}_{H_a}\left(q^{L_0 - c/24}\right)$ for $q = e^{2\pi i \tau}$. $cI$ is the scalar multiple of the identity to which $C$
gets sent; it has a physical interpretation [18] involving Casimir (vacuum) energy, which depends on
space-time topology, and the strange shift by $c/24$ is due to an implicit mapping from the cylinder
to the plane.



1.3. Representations of finite-dimensional simple Lie algebras 



The representation theory of the simple Lie algebras$^7$ can perhaps be regarded as an
enormous generalisation of trigonometry. For instance the facts that $\frac{\sin(nx)}{\sin(x)}$ can be written
as a polynomial in $\cos(x)$ for any $n \in \mathbb{Z}$, and that

$$\frac{\sin((m+1)x)\,\sin((n+1)x)}{\sin(x)} = \sin((m+n+1)x) + \sin((m+n-1)x) + \cdots + \sin((m-n+1)x)$$

for any integers $m \ge n \ge 0$, are both easy special cases of the theory.
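Identities of this kind are easy to test numerically. The sketch below checks two statements at sample points: that $\sin(nx)/\sin(x)$ equals the polynomial $U_{n-1}(\cos x)$, where $U_k$ are the Chebyshev polynomials of the second kind (named here as the standard choice of polynomial, built from $U_0 = 1$, $U_1 = 2c$, $U_{k+1} = 2cU_k - U_{k-1}$), and the product formula in the form $\sin((m+1)x)\sin((n+1)x)/\sin(x) = \sum_{i=0}^{n} \sin((m+n+1-2i)x)$ for $m \ge n \ge 0$. The function names are illustrative only.

```python
import math

def chebyshev_u(n, c):
    """U_n(c) via the recurrence U_0 = 1, U_1 = 2c, U_{k+1} = 2c U_k - U_{k-1}."""
    prev, cur = 1.0, 2.0 * c
    if n == 0:
        return prev
    for _ in range(n - 1):
        prev, cur = cur, 2.0 * c * cur - prev
    return cur

def product_lhs(m, n, x):
    """sin((m+1)x) sin((n+1)x) / sin(x)."""
    return math.sin((m + 1) * x) * math.sin((n + 1) * x) / math.sin(x)

def product_rhs(m, n, x):
    """sin((m+n+1)x) + sin((m+n-1)x) + ... + sin((m-n+1)x), assuming m >= n >= 0."""
    return sum(math.sin((m + n + 1 - 2 * i) * x) for i in range(n + 1))
```

The indexing by $m+1$, $n+1$ is chosen so that $m$ and $n$ play the role of $A_1$ highest-weights, whose characters are $\sin((n+1)x)/\sin(x)$ in this normalisation.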

The classic example of an algebraic structure is the numbers, and they prejudice us
into thinking that commutativity and associativity are the ideal. We have learned over the
past couple of centuries that commutativity can often be dropped without losing depth
and usefulness, but most interesting structures seem to obey some form of associativity.
Moreover, true associativity (as opposed to e.g. anti-associativity) really simplifies the
arithmetic. Given the happy 'accident' that the commutator $[x,y] := xy - yx$ in any
associative algebra obeys anti-associativity, it would seem to be both tempting and natural
to study the ways (if any) in which associative algebras $\mathfrak{A}$ can 'model' or represent a
given Lie algebra $\mathfrak{g}$. Precisely, we are looking for a map $\rho : \mathfrak{g} \to \mathfrak{A}$ which preserves the
linear structure (i.e. $\rho$ is a linear function), and which sends the bracket $[xy]$ in $\mathfrak{g}$ to the
commutator $[\rho(x),\rho(y)]$ in $\mathfrak{A}$.

In practice groups (resp., algebras) often appear as symmetries (resp., infinitesimal
generators of symmetries). These symmetries often act linearly. In other words, in practice
the preferred associative algebras will usually be matrix algebras, and this is the usual
form for a representation and the only kind we will consider. The dimension of these
representations is the size of the matrices.

$^7$ See e.g. [25] for more details. Historically, representations of Lie algebras were considered even before
representations of finite groups.

Finding all possible representations, even for the simple Lie algebras, is probably
hopeless. However, it is possible to find all finite-dimensional representations of the simple
Lie algebras, and the answer is easy to describe. Given a simple Lie algebra $X_\ell$, there is
a representation $L_\lambda$ for each $\ell$-tuple $\lambda = (\lambda_1, \ldots, \lambda_\ell)$ of nonnegative integers. $\lambda$ is called
a highest-weight. Moreover, we can take direct sums $\oplus_i L_{\lambda^{(i)}}$ of finitely many of these
representations. The matrices in such a direct sum will be in block form. It turns out
that, up to change-of-basis, this exhausts all finite-dimensional representations of $X_\ell$.

It is common to replace 'representation $\rho$ of $\mathfrak{g}$' with the equivalent notion of '$\mathfrak{g}$-module
$M$', i.e. we think of the matrices $\rho(x)$ as linear maps $M \to M$. A $\mathfrak{g}$-module is a vector
space on which $\mathfrak{g}$ acts (on the left). Instead of considering the matrix $\rho(x)$, we consider
'products' $xv$ (think of this as the matrix $\rho(x)$ times the column vector $v$) for $v \in M$. This
product must be bilinear, and must obey $[xy]v = x(yv) - y(xv)$.

To get an idea of what $L_\lambda$ looks like, consider $A_1$. Recall its generators $e, f, h$ and
relations (1.2.2b). Choose any $\lambda \in \mathbb{C}$. Define $x_0 \ne 0$ to formally obey $hx_0 = \lambda x_0$ and
$ex_0 = 0$. Define inductively $x_{i+1} := fx_i$ for $i = 0, 1, \ldots$. Define $M_\lambda$ to be the span of all $x_i$
(we will see shortly that they are linearly independent, so $M_\lambda$ is infinite-dimensional).
$M_\lambda$ is a module of $A_1$: the calculations $hx_{i+1} = hfx_i = ([hf] + fh)x_i = (-2f + fh)x_i$
and $ex_{i+1} = efx_i = ([ef] + fe)x_i = (h + fe)x_i$ show inductively that $hx_m = (\lambda - 2m)x_m$
and $ex_m = (\lambda - m + 1)\,m\,x_{m-1}$. From these the reader can show that the $x_i$ are linearly
independent. $M_\lambda$ is called a Verma module; $\lambda$ is called its highest-weight, and $x_0$ is called
a highest-weight vector.
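The closed formulas for the action on $M_\lambda$ can be checked mechanically. The sketch below encodes a vector $\sum_i c_i x_i$ as a dictionary `{i: c_i}` and verifies, on the first few basis vectors, the relations $[ef] = h$, $[he] = 2e$, $[hf] = -2f$. These are the usual $A_1$ relations and are assumed here to be what (1.2.2b) states; the encoding and function names are likewise illustrative choices.

```python
from fractions import Fraction

def act(op, vec, lam):
    """Apply op in {'e', 'f', 'h'} to vec = {i: coeff}, meaning sum coeff * x_i in M_lam."""
    out = {}
    for i, c in vec.items():
        if op == 'f':
            j, d = i + 1, c                       # f x_i = x_{i+1}
        elif op == 'h':
            j, d = i, c * (lam - 2 * i)           # h x_i = (lam - 2i) x_i
        else:
            j, d = i - 1, c * i * (lam - i + 1)   # e x_i = (lam - i + 1) i x_{i-1}
        if d != 0:
            out[j] = out.get(j, 0) + d
    return out

def sub(u, v):
    """u - v, dropping zero coefficients."""
    out = dict(u)
    for i, c in v.items():
        out[i] = out.get(i, 0) - c
    return {i: c for i, c in out.items() if c != 0}

def scale(s, v):
    """s * v, dropping zero coefficients."""
    return {i: s * c for i, c in v.items() if s * c != 0}

def commutator(a, b, vec, lam):
    """[a b] applied to vec, i.e. a(b vec) - b(a vec)."""
    return sub(act(a, act(b, vec, lam), lam), act(b, act(a, vec, lam), lam))

def relations_hold(lam, depth=8):
    """Check the three bracket relations on x_0, ..., x_{depth-1}."""
    for i in range(depth):
        x = {i: 1}
        ok = (commutator('e', 'f', x, lam) == act('h', x, lam)
              and commutator('h', 'e', x, lam) == scale(2, act('e', x, lam))
              and commutator('h', 'f', x, lam) == scale(-2, act('f', x, lam)))
        if not ok:
            return False
    return True
```

With $\lambda = 3$, `act('e', {4: 1}, 3)` returns the empty dictionary: $x_4$ is killed by $e$, which is the phenomenon behind finite-dimensional quotients when $\lambda$ is a nonnegative integer.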

Now specialise to $\lambda = n \in \mathbb{Z}_{\ge} := \{0, 1, 2, \ldots\}$. Note that $ex_{n+1} = 0$ and
$hx_{n+1} = (-n-2)x_{n+1}$. This means that, for these $n$, $M_n$ contains a submodule with highest-weight
vector $x_{n+1}$, isomorphic to $M_{-n-2}$. $x_{n+1}$ is called a null vector. In other words, we
could set $x_{n+1} := 0$ and still have an $A_1$-module. We would then get a finite-dimensional
module which we'll call $L_n := M_n/M_{-n-2}$ (not to be confused with the Virasoro generator
in (1.2.7)). Its basis is $\{x_0, x_1, \ldots, x_n\}$ and so it has dimension $n + 1$.

For example, take $n = 1$. Note that what we get in terms of the basis $\{x_0, x_1\}$ is the
familiar representation of $\mathrm{sl}_2(\mathbb{C})$ given in (1.2.5).
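The finite-dimensional module $L_n$ can be written down as explicit $(n+1) \times (n+1)$ matrices in the basis $x_0, \ldots, x_n$, using $fx_i = x_{i+1}$ (with $x_{n+1}$ set to 0), $hx_i = (n-2i)x_i$ and $ex_{i+1} = (n-i)(i+1)x_i$. The sketch below builds them and checks the bracket relations; the helper names are invented here, and the claim that $n = 1$ reproduces the matrices of (1.2.5) rests on the assumption that (1.2.5) is the usual $2 \times 2$ realisation of $e, f, h$.

```python
def sl2_matrices(n):
    """Matrices of e, f, h on L_n; column i holds the image of the basis vector x_i."""
    d = n + 1
    e = [[0] * d for _ in range(d)]
    f = [[0] * d for _ in range(d)]
    h = [[0] * d for _ in range(d)]
    for i in range(d):
        h[i][i] = n - 2 * i
        if i + 1 < d:
            f[i + 1][i] = 1                   # f x_i = x_{i+1}
            e[i][i + 1] = (i + 1) * (n - i)   # e x_{i+1} = (n - i)(i + 1) x_i
    return e, f, h

def matmul(a, b):
    """Product of two square integer matrices given as nested lists."""
    d = len(a)
    return [[sum(a[i][j] * b[j][l] for j in range(d)) for l in range(d)] for i in range(d)]

def commutator(a, b):
    """The matrix commutator ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(len(a))] for i in range(len(a))]

def times(s, a):
    """Scalar multiple s * a."""
    return [[s * entry for entry in row] for row in a]
```

For $n = 1$ this gives $e = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$, $f = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}$, $h = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$.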

The situation for the other simple Lie algebras $X_\ell$ is similar.

It turns out to be hard to compare representations: $\rho$ and $\rho'$ could be equivalent
(i.e. differ merely by a change-of-basis) but look very different. Or if we are given a
representation, we may want to decompose it into the direct sum of some $L_{\lambda^{(i)}}$. When
working with representations, it is often very useful to avoid much of the extraneous
basis-dependent detail present in the function $\rho$. Finite group theory suggests how to do this:
we should use characters. Weyl's definition of the character of an $A_1$-module $M$ is as follows: write $M$
as a direct sum of eigenspaces $M(m)$ of $h$; then define

$$\mathrm{ch}_M(z) := \sum_m \dim M(m)\, e^{mz}\,, \qquad (1.3.1)$$

for any $z \in \mathbb{C}$. The $m$ are called weights and the $M(m)$ weight-spaces. For example, for
$L_n$ the weights are $m = n, n-2, \ldots, -n$, the weight-spaces $L_n(m)$ are $\mathbb{C}x_{(n-m)/2}$, and

$$\mathrm{ch}_{L_n}(z) = \sum_{i=0}^n e^{(n-2i)z} = \frac{\sinh((n+1)z)}{\sinh(z)}\,, \qquad (1.3.2)$$

which becomes $\frac{\sin((n+1)x)}{\sin(x)}$ at $z = ix$.

Analogous formulas apply to any algebra $X_\ell$: the character will then be a function on an
$\ell$-dimensional subspace $\mathfrak{h}$ called the Cartan subalgebra, spanned by all the $h_i$ (see §2.7),
so can be thought of as a complex-valued function of $\ell$ complex variables. The weights $m$
will lie in the dual space to $\mathfrak{h}$ (i.e. are linear maps $\mathfrak{h} \to \mathbb{C}$), so will have $\ell$ components.
See for instance Figure 8 in [59]. Incidentally, $\ell$ is called the rank of $X_\ell$.
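For $A_1$ the character of $L_n$ is a finite geometric sum, so its closed form is easy to confirm numerically. The sketch below sums $e^{mz}$ over the weights $m = n, n-2, \ldots, -n$ (each weight-space being 1-dimensional) and compares with $\sinh((n+1)z)/\sinh(z)$; at purely imaginary $z = ix$ this is $\sin((n+1)x)/\sin(x)$. The function names are illustrative.

```python
import cmath
import math

def character(n, z):
    """Sum of e^{mz} over the weights m = n, n-2, ..., -n of L_n."""
    return sum(cmath.exp((n - 2 * i) * z) for i in range(n + 1))

def closed_form(n, z):
    """sinh((n+1)z) / sinh(z), valid wherever sinh(z) is nonzero."""
    return cmath.sinh((n + 1) * z) / cmath.sinh(z)
```

Note that `character(n, z)` and `character(n, -z)` agree, since the set of weights is symmetric under $m \mapsto -m$.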

Weyl's definition works: two representations are equivalent iff their characters are
identical, and $M = \oplus_i L_{\lambda^{(i)}}$ iff $\mathrm{ch}_M(z) = \sum_i \mathrm{ch}_{\lambda^{(i)}}(z)$. It also is enormously simpler: e.g.
the smallest nontrivial representation of $E_8$ is a map from $\mathbb{C}^{248}$ to the space of $248 \times 248$
matrices, while its character is a function $\mathbb{C}^8 \to \mathbb{C}$. But why is Weyl's definition natural?
How did he come up with it?

To answer that question, we must remind ourselves of the characters of finite groups$^8$.
A representation of a finite group $G$ is a structure-preserving map $\rho$ (i.e. a group
homomorphism) from $G$ to matrices. The group's product becomes matrix product. In these
notes we will be exclusively interested in group representations over $\mathbb{C}$. Two
representations $\rho, \rho'$ are called equivalent if there exists a matrix (change-of-basis) $U$ such that
$\rho'(g) = U\rho(g)U^{-1}$ for all $g$. The character $\mathrm{ch}_\rho$ is the map $G \to \mathbb{C}$ given by the trace:
$\mathrm{ch}_\rho(g) = \mathrm{tr}(\rho(g))$. We see that equivalent representations will have the same character,
because of the fundamental identity $\mathrm{tr}(AB) = \mathrm{tr}(BA)$. This identity also tells us that
the character is a 'class function', i.e. $\mathrm{ch}_\rho(hgh^{-1}) = \mathrm{tr}(\rho(h)\,\rho(g)\,\rho(h)^{-1}) = \mathrm{ch}_\rho(g)$, so $\mathrm{ch}_\rho$
is constant on each 'conjugacy class'. Group characters are also enormously simpler than
representations: e.g. the smallest nontrivial representation of the Monster $\mathbb{M}$ (see Part 2)
consists of almost $10^{54}$ matrices, each of size $196883 \times 196883$, while its character consists
of 194 complex numbers. Incidentally, finite group representations behave analogously to
the representations of $X_\ell$: the role of the modules $L_\lambda$ is played by the irreducible
representations $\rho_i$, and any finite-dimensional representation of $G$ can be decomposed uniquely
into a direct sum of various $\rho_i$. The difference is that there are only finitely many $\rho_i$:
their number equals the number of conjugacy classes of $G$.
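A small worked example may help. The sketch below takes the symmetric group $S_3$ (chosen here purely for illustration) acting by $3 \times 3$ permutation matrices, whose character is the number of fixed points. It checks that this character is a class function, counts the conjugacy classes, and decomposes the representation using the standard formula (multiplicity of $\rho_i$) $= \frac{1}{|G|}\sum_g \mathrm{ch}(g)\,\mathrm{ch}_{\rho_i}(g)$, which is valid as written because all characters involved are real. The three irreducible characters of $S_3$ used below (trivial, sign, and fixed points minus 1) are standard facts assumed here rather than derived.

```python
from itertools import permutations

G = list(permutations(range(3)))  # S_3: a tuple p sends i to p[i]

def compose(p, q):
    """The permutation i -> p[q[i]]."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, pi in enumerate(p):
        inv[pi] = i
    return tuple(inv)

def fixed_points(p):
    """Trace of the permutation matrix of p."""
    return sum(1 for i, pi in enumerate(p) if i == pi)

def sign(p):
    """+1 for even permutations, -1 for odd ones (parity of the inversion count)."""
    inversions = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def conjugacy_classes():
    """The set of conjugacy classes {h g h^-1 : h in G}."""
    return {frozenset(compose(compose(h, g), inverse(h)) for h in G) for g in G}

def multiplicity(chi, psi):
    """(1/|G|) sum_g chi(g) psi(g), for real-valued characters chi and psi."""
    return sum(chi(g) * psi(g) for g in G) // len(G)
```

Here `multiplicity(fixed_points, lambda g: 1)` and `multiplicity(fixed_points, lambda g: fixed_points(g) - 1)` both equal 1, while the sign character does not occur: the permutation representation is the direct sum of the trivial and the 2-dimensional irreducible representations.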

We can use this group intuition here. In particular, given any Lie algebra $X_\ell$ and
representation $\rho$, we can think of the map $e^x \mapsto e^{\rho(x)}$ as a representation of a Lie group
$G(X_\ell)$ corresponding to $X_\ell$ (the exponential $e^A$ of a matrix $A$ is defined by the usual power
series; it will always converge). The trace of the matrix $e^{\rho(x)}$ will be the group character
value at $e^x \in G(X_\ell)$, so we'll define it to be the algebra character value at $x \in X_\ell$. Again,
it suffices to consider only representatives of each conjugacy class of $G(X_\ell)$, because the
character will be a class function. Now, almost every matrix is diagonalisable (since almost
any $n \times n$ matrix has $n$ distinct eigenvalues), and so it would seem we aren't losing much
by restricting $x \in X_\ell$ to diagonalisable matrices. Hence we may take our conjugacy class
representatives to be diagonal matrices $x \in X_\ell$, i.e. (for $X_\ell = A_1$) to $x = zh$ for $z \in \mathbb{C}$
($h$ is diagonal in the $x_i$ basis of $L_\lambda$). So the algebra character can be chosen to be a
function of $z$. Finally, the trace of $e^{\rho(x)} = e^{\rho(zh)}$ will be given by (1.3.1). This completes
the motivation for Weyl's character formula.

$^8$ Surprisingly, what we now call the characters of group representations were invented almost a decade before
group representations were.

There is one other important observation we can make. Different diagonal matrices
can belong to the same conjugacy class. For instance,

$$\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix} \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}^{-1} = \begin{pmatrix} b & 0 \\ 0 & a \end{pmatrix}$$

so $e^{zh}$ and $e^{-zh}$ lie in the same $G(A_1) = \mathrm{SL}_2(\mathbb{C})$ conjugacy class. Hence
$\mathrm{ch}_M(z) = \mathrm{ch}_M(-z)$. This symmetry $z \mapsto -z$ belongs to the Weyl group for $A_1$. Each $X_\ell$ has similar
symmetries, and the Weyl group plays an important role in the whole theory, sort of
analogous to the modular group for modular functions we'll discuss in §2.3.

Weyl found a generalisation of the right-side of (1.3.2), valid for all $X_\ell$. The character
of $L_\lambda$ can be written as a fraction (2.8.1): the numerator will be an alternating sum over
the Weyl group, and the denominator will be a product over 'positive roots'. This formula
and its generalisations have profound consequences, as we'll see in §2.8.

Incidentally, the trigonometric identities given at the beginning of this section are
the tensor product formula of representations (interpreted as the product and sum of
characters), and the fact that an arbitrary character can be written as a polynomial in the
fundamental characters, both specialised to $A_1$ (see (1.3.2) for the $A_1$ characters).
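The tensor product statement can be made concrete for $A_1$. Writing $q = e^z$, the character of $L_n$ is the Laurent polynomial $q^n + q^{n-2} + \cdots + q^{-n}$; multiplying two such characters and repeatedly peeling off the character of the highest remaining weight recovers the decomposition of the tensor product. The sketch below does this with weight-multiplicity tables; the encoding as `Counter` objects and the function names are choices made here.

```python
from collections import Counter

def character(n):
    """Weight multiplicities of L_n: one copy of each weight n, n-2, ..., -n."""
    return Counter({n - 2 * i: 1 for i in range(n + 1)})

def multiply(a, b):
    """Product of two characters (weights add, multiplicities multiply)."""
    out = Counter()
    for wa, ca in a.items():
        for wb, cb in b.items():
            out[wa + wb] += ca * cb
    return out

def decompose(ch):
    """Highest-weights n (with repetition) such that ch is the sum of the characters of L_n."""
    ch = Counter(ch)
    highest_weights = []
    while any(c > 0 for c in ch.values()):
        top = max(w for w, c in ch.items() if c > 0)
        mult = ch[top]
        highest_weights += [top] * mult
        for w, c in character(top).items():
            ch[w] -= mult * c
    return highest_weights
```

For example `decompose(multiply(character(3), character(2)))` gives `[5, 3, 1]`, i.e. the tensor product of $L_3$ and $L_2$ is the direct sum of $L_5$, $L_3$ and $L_1$.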



1.4. Affine algebras and the Kac-Peterson matrices 

The theory of nontwisted affine Kac-Moody algebras (usually called affine algebras 
or current algebras) is extremely analogous to that of the finite-dimensional simple Lie 
algebras. Nothing infinite-dimensional tries harder to be finite-dimensional than affine 
algebras. Standard references for the following material are [38,41,24]. 

Let $X_\ell$ be any simple finite-dimensional Lie algebra. The affine algebra $X_\ell^{(1)}$ is
essentially the loop algebra $\mathcal{L}(X_\ell)$, defined to be all possible 'Laurent polynomials' $\sum_{n \in \mathbb{Z}} a_n t^n$
where each $a_n \in X_\ell$ and all but finitely many $a_n = 0$. $t$ here is an indeterminate. The
bracket in $\mathcal{L}(X_\ell)$ is the obvious one: e.g. $[at^m, bt^n] = [ab]\,t^{m+n}$. Geometrically, $\mathcal{L}(X_\ell)$ is
the Lie algebra of polynomial maps $S^1 \to X_\ell$, hence the name (for that realisation, think
of $t = e^{i\theta}$). Hence there are many generalisations of the loop algebra (e.g. any manifold
in place of $S^1$ will do), closely related ones called toroidal algebras being the Lie algebra
of maps $S^1 \times \cdots \times S^1 \to X_\ell$. But the loop algebra is simplest and best understood, and
the only one we'll consider. Note that $\mathcal{L}(X_\ell)$ is infinite-dimensional. Its Lie groups are the
loop groups, consisting of all loops $S^1 \to G(X_\ell)$ in a Lie group for $X_\ell$.

We saw $S^1$ before, in the discussion of the Witt algebra. Thus the Virasoro and affine
algebras should be related. In fact, the Virasoro algebra acts on the affine algebras as
'derivations', and this connection plays an important technical role in the theory.

$X_\ell^{(1)}$ is in the same relation to the loop algebra that the Virasoro $\mathcal{V}$ is to the Witt
$\mathcal{W}$. Namely, it is its (unique nontrivial 1-dimensional) central extension; see e.g. (7.7.1)
of [38] for the analogue of (1.2.7a) here. In addition, for more technical reasons, a further
(noncentral) 1-dimensional extension is usually made: the derivation $t\frac{d}{dt}$ is included (see
footnote 33). $X_\ell^{(1)}$ is the simplest of the infinite-dimensional Kac-Moody algebras. The
superscript '(1)' denotes the fact that the loop algebra was twisted by an order-1
automorphism, i.e. that it is untwisted. It is called 'affine' because of its Weyl group, as we
shall see.

Central extensions are a common theme in today's infinite-dimensional Lie theory$^9$.
Their raison d'être is always the same: a richer supply of representations. For example,
$\mathcal{W}$ has several representations, but no nontrivial one is an 'irreducible unitary
positive-energy representation', the kind of greatest interest in math phys. On the other hand,
its central extension $\mathcal{V}$ has a rich supply of those representations (e.g. there's one for
each choice of $c > 1$, $h > 0$, namely the Verma module $V_{c,h}$ corresponding to $L_0x_0 = hx_0$,
$Cx_0 = cx_0$). At the level of groups, central extensions allow projective representations
(i.e. representations up to a scalar factor) to become true representations. Projective
representations (hence central extensions) appear naturally in QFT because a quantum
state vector $|v\rangle$ is physically indistinguishable from any nonzero scalar multiple $\alpha|v\rangle$.

All of the quantities associated to $X_\ell$ have an analogue here: Dynkin diagram, Weyl
group, weights,... For instance, the affine Dynkin diagram is obtained from the Dynkin
diagram for $X_\ell$ by adding one node. See for example Figure 9 of [59]. The extra node is
always labelled by a '0'. The Cartan subalgebra $\mathfrak{h}$ here will be $(\ell+2)$-dimensional. Many
of these details will be discussed further in §2.7 below.

The construction of $X_\ell^{(1)}$ is so trivial that it seems surprising anything interesting and
new can happen here. But a certain 'miracle' happens...

No interesting representation of $X_\ell^{(1)}$ is finite-dimensional. The analogues for $X_\ell^{(1)}$ of
the finite-dimensional representations of $X_\ell$ are called the integrable highest-weight
representations, and will be denoted $L_\lambda$. The highest-weight $\lambda$ here will be an $(\ell+1)$-tuple
$(\lambda_0, \lambda_1, \ldots, \lambda_\ell)$, $\lambda_i \in \mathbb{Z}_{\ge}$ (strictly speaking, it will be an $(\ell+2)$-tuple, but the extra
component is not important and is usually ignored). As for $X_\ell$, the highest-weights can be
thought of as the assignment of a nonnegative integer to each node of the Dynkin diagram.
The construction of $L_\lambda$ is as in the finite-dimensional case. They are called integrable
because they are precisely those highest-weight representations which can be 'integrated'
to a projective representation of the corresponding loop group, and hence a representation
of a central extension of the loop group.

We define the character $\chi_\lambda$ as in (1.3.1), though now the weights $m$ will be
$(\ell+2)$-tuples, and there will be infinitely many of them. $\chi_\lambda$ will be a complex-valued function of
$\ell+2$ complex variables $(z, \tau, u)$ (see (1.4.1a) below). It can be written as an alternating
sum over the Weyl group $W$, over a 'nice' denominator. The difference here is that $W$ is
now infinite.

Perhaps most of the interest in affine algebras can be traced to the 'miracle' that
their Weyl groups are a semidirect product $Q^\vee \rtimes \overline{W}$ of translations in a lattice $Q^\vee$ (the
$\ell$-dimensional 'co-root lattice' of $X_\ell$; see §1.6) with the (finite) Weyl group $\overline{W}$ of $X_\ell$.
See Figure 10 of [59] for the Weyl group of $A_2^{(1)}$. 'Semidirect product'$^{10}$ means that any
element of $W$ can be written uniquely as $(t, w)$ for some translation $t$ and some $w \in \overline{W}$,
and $(t, w) \circ (t', w') = (\mathrm{stuff}, w \circ w')$.

$^9$ Incidentally the finite-dimensional simple Lie algebras do not have nontrivial central extensions.

One thing this implies is that $\chi_\lambda$ will be of the form 'theta function'/denominator.
Theta functions are classically-studied modular forms (we will discuss these terms in §2.3),
and thus the modular group $\mathrm{SL}_2(\mathbb{Z})$ will make an appearance! To make this more precise,
consider the highest-weight $\lambda = (\lambda_0, \lambda_1)$ of $A_1^{(1)}$, and write $k = \lambda_0 + \lambda_1$. Then

$$\chi_\lambda = \frac{\Theta^{(k+2)}_{\lambda_1+1} - \Theta^{(k+2)}_{-\lambda_1-1}}{\Theta^{(2)}_{1} - \Theta^{(2)}_{-1}} \qquad (1.4.1a)$$

where these functions all depend on 3 complex variables $z, \tau, u$, and

$$\Theta^{(n)}_m(z, \tau, u) := e^{-2\pi i n u} \sum_{\ell \in \sqrt{2}\,\mathbb{Z} + \frac{m}{\sqrt{2}\,n}} \exp[\pi i n \tau \ell^2 - 2\sqrt{2}\,\pi i n \ell z]\,. \qquad (1.4.1b)$$

In (1.4.1a) we can see the alternating sum over the Weyl group of $A_1$ in the numerator
(and denominator, since we've used the $A_1^{(1)}$ denominator identity in writing (1.4.1a)).
For general $X_\ell^{(1)}$, the denominator will always be independent of $\lambda$, and the theta function
(1.4.1b) will become a multidimensional one involving a sum over $Q^\vee$ shifted by some
weight and appropriately rescaled. The (co-)root lattice of $A_1$ is $\sqrt{2}\,\mathbb{Z}$. The key variable in
(1.4.1a) is the modular one $\tau$, which will lie in the upper half complex plane $\mathbb{H}$ (in order
to have convergence). In the applications to CFT, the other variables are often set to 0.

The number $k$ introduced in (1.4.1a) plays an important role in the general theory.
In the representation $L_\lambda$, the central term $C$ will get sent to some multiple of the identity;
the multiplier is labelled $k$ and is called the level of the representation. For any $X_\ell^{(1)}$
there is a simple formula expressing the level $k$ in terms of the highest-weight $\lambda$; e.g. for
$A_\ell^{(1)}$ and $C_\ell^{(1)}$ it is given by $k = \lambda_0 + \lambda_1 + \cdots + \lambda_\ell$. Write $P_+^k$ for the (finite) set of level
$k$ highest-weights (so the size of $P_+^k$ for $A_\ell^{(1)}$ is $\binom{k+\ell}{\ell}$). An important weight in $P_+^k$ is
$(k, 0, \ldots, 0)$. We will denote this '0'. In RCFT it corresponds to the vacuum.
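For $A_\ell^{(1)}$ the set of level $k$ highest-weights is just the set of $(\ell+1)$-tuples of nonnegative integers summing to $k$, so it can be listed directly and its size compared with the binomial coefficient $\binom{k+\ell}{\ell}$. The sketch below does this; the function name is invented for the illustration.

```python
from math import comb

def level_k_weights(rank, k):
    """All (rank+1)-tuples (lambda_0, ..., lambda_rank) of nonnegative integers with sum k."""
    if rank == 0:
        return [(k,)]
    return [(first,) + rest
            for first in range(k + 1)
            for rest in level_k_weights(rank - 1, k - first)]
```

For $A_1^{(1)}$ (rank 1) this gives the $k+1$ weights $(k - a, a)$, $a = 0, \ldots, k$, and the vacuum weight $(k, 0, \ldots, 0)$ is always in the list.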

The modular group $\mathrm{SL}_2(\mathbb{Z})$ acts on the Cartan subalgebra $\mathfrak{h}$ of $X_\ell^{(1)}$ in the following
way:

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} (z, \tau, u) = \left( \frac{z}{c\tau+d},\ \frac{a\tau+b}{c\tau+d},\ u - \frac{c\, z \cdot z}{2(c\tau+d)} \right).$$

Under this action, the characters $\chi_\lambda$ also transform nicely: in particular we find for any
level $k$ weight $\lambda$

$$\chi_\lambda\!\left(\frac{z}{\tau}, \frac{-1}{\tau}, u - \frac{z \cdot z}{2\tau}\right) = \sum_{\mu \in P_+^k} S_{\lambda\mu}\, \chi_\mu(z, \tau, u) \qquad (1.4.2a)$$

$$\chi_\lambda(z, \tau+1, u) = \sum_{\mu \in P_+^k} T_{\lambda\mu}\, \chi_\mu(z, \tau, u) \qquad (1.4.2b)$$

where $S$ and $T$ are complex matrices called the Kac-Peterson matrices. $S$ will always be
symmetric and unitary, and has many remarkable properties as we shall see. Its entries
are related to Lie group characters at elements of finite order (see (1.4.5) below). $T$ is
diagonal and unitary; its entries are related to the eigenvalues of the quadratic Casimir.

$^{10}$ This is also discussed briefly in section 2.2.

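For example, consider $A_1^{(1)}$ at level $k$. Then $S$ and $T$ will be $(k+1) \times (k+1)$ matrices
given by

$$S_{\lambda\mu} = \sqrt{\frac{2}{k+2}}\, \sin\!\left( \pi\, \frac{(\lambda_1+1)(\mu_1+1)}{k+2} \right), \qquad T_{\lambda\lambda} = \exp\!\left[ \frac{\pi i\,(\lambda_1+1)^2}{2(k+2)} - \frac{\pi i}{4} \right]. \qquad (1.4.3)$$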
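The stated properties of the Kac-Peterson matrices are easy to test numerically for $A_1^{(1)}$. The sketch below builds $S_{ab} = \sqrt{2/(k+2)}\,\sin(\pi(a+1)(b+1)/(k+2))$ and $T_{aa} = \exp[\pi i (a+1)^2/(2(k+2)) - \pi i/4]$ for $a, b = 0, \ldots, k$, then checks that $S$ is symmetric and unitary, that $T$ is unitary, and that $(ST)^3 = S^2$, the defining relation of the modular group (here $S^2$ happens to be the identity). All helper names are choices made for this illustration.

```python
import cmath
import math

def kac_peterson(k):
    """S and T for A_1 at level k, indexed by lambda_1 = 0, ..., k."""
    n = k + 2
    S = [[math.sqrt(2 / n) * math.sin(math.pi * (a + 1) * (b + 1) / n)
          for b in range(k + 1)] for a in range(k + 1)]
    T = [[cmath.exp(1j * math.pi * ((a + 1) ** 2 / (2 * n) - 0.25)) if a == b else 0
          for b in range(k + 1)] for a in range(k + 1)]
    return S, T

def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    d = len(A)
    return [[sum(A[i][j] * B[j][l] for j in range(d)) for l in range(d)] for i in range(d)]

def conjugate_transpose(A):
    d = len(A)
    return [[complex(A[j][i]).conjugate() for j in range(d)] for i in range(d)]

def is_identity(A, tol=1e-9):
    d = len(A)
    return all(abs(A[i][j] - (1 if i == j else 0)) < tol for i in range(d) for j in range(d))

def is_unitary(A):
    return is_identity(matmul(A, conjugate_transpose(A)))
```

A typical use is `S, T = kac_peterson(4)` followed by `is_unitary(S)`, `is_unitary(T)` and a check that the cube of `matmul(S, T)` is the identity.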

One important place $S$ appears is the famous Verlinde formula

$$N_{\lambda\mu}^{\nu} = \sum_{\kappa \in P_+^k} \frac{S_{\lambda\kappa}\, S_{\mu\kappa}\, S^*_{\nu\kappa}}{S_{0\kappa}} \qquad (1.4.4)$$

for the fusion coefficients $N_{\lambda\mu}^{\nu}$ of the corresponding RCFT. We will investigate some
consequences of this formula in a later section. The fusion coefficients for the affine algebras
are well-understood; see e.g. Section 4 of [59] for their interpretation (usually called the
Kac-Walton formula) as 'folded tensor product coefficients'.
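It is instructive to see Verlinde's formula produce nonnegative integers. The sketch below evaluates it for $A_1^{(1)}$ at level $k$, using $S_{ab} = \sqrt{2/(k+2)}\,\sin(\pi(a+1)(b+1)/(k+2))$, and compares the result with the commonly quoted closed form for these fusion rules: $N_{ab}^c = 1$ when $|a-b| \le c \le \min(a+b, 2k-a-b)$ and $a+b+c$ is even, and 0 otherwise. That closed form is stated here as an assumption (a truncated version of the $A_1$ tensor product rule), and the function names are illustrative.

```python
import math

def s_matrix(k):
    """The A_1 level k matrix S, indexed by 0, ..., k."""
    n = k + 2
    return [[math.sqrt(2 / n) * math.sin(math.pi * (a + 1) * (b + 1) / n)
             for b in range(k + 1)] for a in range(k + 1)]

def fusion(a, b, c, k):
    """N_{ab}^c from Verlinde's formula; S is real here, so no complex conjugate is needed."""
    S = s_matrix(k)
    total = sum(S[a][x] * S[b][x] * S[c][x] / S[0][x] for x in range(k + 1))
    return round(total)

def truncated_rule(a, b, c, k):
    """Assumed closed form: tensor product rule for A_1, cut off at level k."""
    return int(abs(a - b) <= c <= min(a + b, 2 * k - a - b) and (a + b + c) % 2 == 0)
```

For instance, at $k = 3$ the weights $c$ with `fusion(2, 2, c, 3)` nonzero are 0 and 2, whereas the ordinary tensor product of $L_2$ with itself would also contain $L_4$.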

We will see in §1.7 that symmetries of the extended Dynkin diagram have consequences
for $S$ and $T$ (simple-currents, charge-conjugation). There is a 'Galois action' on $S$ which
we will discuss in §1.8. There is a strange property of $S$ and $T$ called rank-level duality (see
e.g. [45]): the matrices for $A_\ell^{(1)}$ at level $k$ are closely related to those of $A_{k-1}^{(1)}$ at level $\ell+1$,
and similar statements hold for $B_\ell^{(1)}$, $C_\ell^{(1)}$, and $D_\ell^{(1)}$. Another reason $S$ is mathematically
interesting is the formula

$$\frac{S_{\lambda\mu}}{S_{0\mu}} = \mathrm{ch}_{\bar\lambda}\!\left( -2\pi i\, \frac{\bar\mu + \rho}{k + h^\vee} \right). \qquad (1.4.5)$$

The right-side is a character of $X_\ell$, and $\bar\lambda = (\lambda_1, \ldots, \lambda_\ell)$ means ignore the extended node.
$\rho$ is the 'Weyl vector' $(1, 1, \ldots, 1)$ and $h^\vee$ is called the dual Coxeter number and is the level
of $\rho$. For $A_\ell^{(1)}$, $h^\vee = \ell+1$. Of course the right-side can also be regarded as a character for
a Lie group associated to $X_\ell$, in which case the argument would have to be exponentiated
and would correspond to an element of finite order in the group. These numbers (1.4.5)
have been studied by many people (most extensively by Pianzola) and have some nice
properties. For instance Moody-Patera (1984) have argued that exploiting them leads to
some quick algorithms for computing e.g. tensor product coefficients. Kac [37] found a
curious application for them: a Lie theoretic proof of 'quadratic reciprocity'!

Quadratic reciprocity is one of the gems of classical number theory. It tells us that
the equations

$$x^2 \equiv a \pmod{b}$$
$$y^2 \equiv b \pmod{a}$$

are related; more precisely, for fixed $a$ and $b$ (for simplicity take them to be distinct primes $\ne 2$)
the questions of whether there is a solution $x$ to the first equation and a solution $y$ to the
second, are related. They will both have the same yes or no answer, unless $a \equiv b \equiv 3$ (mod
4), in which case they will have opposite answers. E.g. take $a = 23$ and $b = 3$, then we
know the first equation does not have a solution (since $a \equiv 2$ (mod 3) and $x^2 \equiv 2$ doesn't
have a solution mod 3), and hence the second equation must have a solution (indeed, $y = 7$
works). There are now many proofs for quadratic reciprocity, and Kac used Lie characters
at elements of finite order to find another one.
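Quadratic reciprocity is simple enough to confirm by exhaustive search for small primes. The sketch below tests, by trying every residue, whether $x^2 \equiv a$ (mod $b$) is solvable, and then checks the law as stated: for distinct odd primes the two questions have the same answer unless both primes are 3 mod 4, in which case the answers differ. The function names are invented for this illustration.

```python
def is_square_mod(a, b):
    """True if x^2 = a (mod b) has a solution x."""
    return any((x * x - a) % b == 0 for x in range(b))

def reciprocity_holds(a, b):
    """Check the law for a pair of distinct odd primes a, b."""
    same_answer = is_square_mod(a, b) == is_square_mod(b, a)
    both_3_mod_4 = a % 4 == 3 and b % 4 == 3
    return same_answer != both_3_mod_4

def odd_primes_below(n):
    """Odd primes less than n, by trial division."""
    return [p for p in range(3, n, 2) if all(p % d for d in range(3, int(p ** 0.5) + 1, 2))]
```

With $a = 23$, $b = 3$ one finds `is_square_mod(23, 3)` false and `is_square_mod(3, 23)` true, in line with $7^2 = 49 = 2 \cdot 23 + 3$.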

What is interesting here is that Kac's proof uses only certain special weights for $A_\ell$.
The natural question is: is it possible to find any generalisations of quadratic reciprocity 
using other weights and algebras? Many generalisations of quadratic reciprocity are known; 
will generalising Kac's argument recover them, or will they perhaps yield new reciprocity 
laws? It seems no one knows. 

The relation (1.4.5) is important because it connects finite-dimensional Lie data with 
infinite-dimensional Lie data. The 'conceptual arrow' can be exploited both ways: in the 
generalisations of the arguments of §1.9 to other algebras, (1.4.5) allows us to use our 
extensive knowledge of finite-dimensional algebras to squeeze out some information in the 
affine setting; but also it is possible to use the richer symmetries of the affine data to 
see 'hidden' symmetries in finite-dimensional data. For example it can be used
(Gannon-Walton 1995) to find a sort of Galois symmetry of dominant weight multiplicities in $X_\ell$,
which would be difficult or impossible to anticipate without (1.4.5).



1.5. The classification of physical invariants 

We are interested in the following classification problem. Choose any affine algebra
$X_\ell^{(1)}$ and level $k \in \mathbb{Z}_{\ge}$. Find all matrices $M = (M_{\lambda\mu})_{\lambda,\mu \in P_+^k}$ such that

(P1) $MS = SM$ and $MT = TM$, where $S, T$ are the Kac-Peterson matrices (1.4.2);
(P2) each entry $M_{\lambda\mu} \in \mathbb{Z}_{\ge}$;
(P3) $M_{00} = 1$.

Any such $M$, or equivalently the corresponding partition function $Z = \sum_{\lambda,\mu} M_{\lambda\mu}\, \chi_\lambda\, \chi_\mu^*$,
is called a physical invariant.

The first and most important classification of physical invariants was the
Cappelli-Itzykson-Zuber A-D-E classification for $A_1^{(1)}$ at all levels $k$ [8]. We will give their result
shortly. This implies for instance the minimal model RCFT classification, as well as the
$N = 1$ super(symmetric) conformal minimal models. The other classifications of comparable
magnitude are $A_2^{(1)}$ for all $k$; $A_\ell^{(1)}$, $B_\ell^{(1)}$ and $D_\ell^{(1)}$ for all $k \le 3$; $(A_1 \oplus A_1)^{(1)}$ for all
levels $(k_1, k_2)$; and $(u(1) \oplus \cdots \oplus u(1))^{(1)}$ for all (matrix-valued) levels $k$. See e.g. [29] for
references. The most difficult of these classifications is for $A_2^{(1)}$, done by Gannon (1994).

In other words, very little in this direction has been accomplished in the 15 or so years 
this problem has existed. But this is not really a good measure of progress. The effort 
instead has been directed primarily towards the full classification; most of these partial 
results are merely easy spin-offs from that more serious and ambitious assault. 

The proof in [8] was long and formidable, and ran along the following lines. First, an
explicit basis was found for the vector space (called the 'commutant') of all matrices
obeying (P1). Then (P2) and (P3) were imposed. Others tried to apply this approach
to $A_2^{(1)}$, but without success. The
eventual proof for $A_2^{(1)}$ was completely independent of the [8] argument, and exploited
more of the structure implicit in the problem. As the $A_2^{(1)}$ argument became more refined,
it became the model for the general assault. In §1.9 we sketch this new approach.

From this more general perspective, of these completed classifications only the level 2
$B_\ell^{(1)}$ and $D_\ell^{(1)}$ ones will have any lasting value (the orthogonal algebras at level 2 behave
very peculiarly, possess large numbers of exceptional physical invariants, and must be
treated separately). The others behave more generically and will fall out as special cases
once the more general classifications are concluded. Other classifications which should
be straightforward with our present understanding are $C_2^{(1)}$ at all $k$; $G_2^{(1)}$ at all $k$; and
$B_\ell^{(1)}$ and $D_\ell^{(1)}$ at $k = 4$. The $C_2^{(1)}$ should be easiest and would imply the $C_\ell^{(1)}$ level 2, as
well as the $B_\ell^{(1)}$ and $D_\ell^{(1)}$ level 5, classifications. A very safe conjecture is that the only
exceptional physical invariants (we define this term in §1.7) for $C_2^{(1)}$ occur at $k = 3, 7, 8, 12$;
this is known to be true for all $k < 500$. $G_2^{(1)}$ would be more difficult but also much more
valuable; its only known exceptionals occur at $k = 3, 4$, and these are the only exceptionals
for $k < 500$, and a very safe conjecture is that there are no other $G_2^{(1)}$ exceptionals. $B_\ell^{(1)}$
and $D_\ell^{(1)}$ at level 4 will also be more difficult, but also would be valuable; less is understood
about their physical invariants and there is a good chance new exceptionals exist there.

The most surprising thing about the known physical invariant classifications is that
there are so few surprises: almost every physical invariant is 'generic'. We will see that the
symmetries of the extended Dynkin diagram give rise to general families of physical
invariants. We will call any physical invariants which do not arise in these generic ways (i.e.
using what are called simple-currents or conjugations), exceptional. Many exceptionals
have been found, and now we are almost at the point where we can safely conjecture the
complete list of physical invariants for $X_\ell^{(1)}$ at any $k$, for $X_\ell$ a simple algebra.

Unfortunately the classification for semi-simple algebras $X_{\ell_1} \oplus \cdots \oplus X_{\ell_s}$ does not reduce
to the one for simple ones. In fact, any explicit classification of the physical invariants for
$X^{(1)}$, for all semi-simple $X$, would easily be one of the greatest accomplishments in the
history of math, for it would include as a small part such monumental things as an explicit
classification of all positive-definite integral lattices. Thus we unfortunately cannot expect
an explicit classification for the semi-simple algebras.

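To make this discussion more concrete and explicit, consider $A_1^{(1)}$. For convenience
drop $\lambda_0$, so $P_+^k = \{0, 1, \ldots, k\}$. Write $J$ for the permutation (called a simple-current)
$Ja := k - a$. Then the complete list of physical invariants for $A_1^{(1)}$ is

$$\mathcal{A}_{k+1} = \sum_{a=0}^k |\chi_a|^2\,, \qquad \text{for all } k \ge 1$$

$$\mathcal{D}_{\frac{k}{2}+2} = \sum_{a=0}^k \chi_a\, \chi^*_{J^a a}\,, \qquad \text{whenever } \tfrac{k}{2} \text{ is odd}$$

$$\mathcal{D}_{\frac{k}{2}+2} = |\chi_0 + \chi_{J0}|^2 + |\chi_2 + \chi_{J2}|^2 + \cdots + 2|\chi_{\frac{k}{2}}|^2\,, \qquad \text{whenever } \tfrac{k}{2} \text{ is even}$$

$$\mathcal{E}_6 = |\chi_0 + \chi_6|^2 + |\chi_3 + \chi_7|^2 + |\chi_4 + \chi_{10}|^2\,, \qquad \text{for } k = 10$$

$$\mathcal{E}_7 = |\chi_0 + \chi_{16}|^2 + |\chi_4 + \chi_{12}|^2 + |\chi_6 + \chi_{10}|^2 + \chi_8\,(\chi_2 + \chi_{14})^* + (\chi_2 + \chi_{14})\,\chi_8^* + |\chi_8|^2\,, \qquad \text{for } k = 16$$

$$\mathcal{E}_8 = |\chi_0 + \chi_{10} + \chi_{18} + \chi_{28}|^2 + |\chi_6 + \chi_{12} + \chi_{16} + \chi_{22}|^2\,, \qquad \text{for } k = 28\,.$$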

The physical invariants $\mathcal{A}_n$ and $\mathcal{D}_n$ are generic, corresponding respectively to the order
1 (i.e. identity) and order 2 (i.e. the simple-current $J$) Dynkin diagram symmetries, as we
shall see in §1.7. Physically, they are the partition functions of WZW models on $\mathrm{SU}_2(\mathbb{C})$
and $\mathrm{SO}_3(\mathbb{R})$ group manifolds, resp. The exceptionals $\mathcal{E}_6$ and $\mathcal{E}_8$ are best interpreted as due
to the $C_{2,1} \supset A_{1,10}$ and $G_{2,1} \supset A_{1,28}$ conformal embeddings (see §1.7; standard notation
is to write '$X_{\ell,k}$' for '$X_\ell^{(1)}$ and level $k$'). The $\mathcal{E}_7$ exceptional is harder to interpret, but can
be thought of as the first in an infinite series of exceptionals involving rank-level duality
and $D_4$ triality.
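The conditions defining a physical invariant ($M$ commutes with $S$ and $T$, has nonnegative integer entries, and has $M_{00} = 1$) can be tested numerically on entries of the $A_1^{(1)}$ list. The sketch below uses $S_{ab} = \sqrt{2/(k+2)}\,\sin(\pi(a+1)(b+1)/(k+2))$ and $T_{aa} = \exp[\pi i (a+1)^2/(2(k+2)) - \pi i/4]$, and encodes an invariant of the form $\sum |\chi_a + \chi_b + \cdots|^2$ by its list of blocks. The helper names and the block encoding are choices made here for illustration.

```python
import cmath
import math

def kac_peterson(k):
    """S and T for A_1 at level k, indexed by 0, ..., k."""
    n = k + 2
    S = [[math.sqrt(2 / n) * math.sin(math.pi * (a + 1) * (b + 1) / n)
          for b in range(k + 1)] for a in range(k + 1)]
    T = [[cmath.exp(1j * math.pi * ((a + 1) ** 2 / (2 * n) - 0.25)) if a == b else 0
          for b in range(k + 1)] for a in range(k + 1)]
    return S, T

def block_invariant(k, blocks):
    """M[a][b] = number of blocks containing both a and b, i.e. a sum of |chi_a + chi_b + ...|^2."""
    M = [[0] * (k + 1) for _ in range(k + 1)]
    for block in blocks:
        for a in block:
            for b in block:
                M[a][b] += 1
    return M

def d_odd_invariant(k):
    """M for sum_a chi_a chi*_{J^a a}: even a pairs with a, odd a pairs with k - a."""
    M = [[0] * (k + 1) for _ in range(k + 1)]
    for a in range(k + 1):
        M[a][a if a % 2 == 0 else k - a] = 1
    return M

def commutes(A, B, tol=1e-9):
    d = len(A)
    for i in range(d):
        for l in range(d):
            ab = sum(A[i][j] * B[j][l] for j in range(d))
            ba = sum(B[i][j] * A[j][l] for j in range(d))
            if abs(ab - ba) > tol:
                return False
    return True

def is_physical_invariant(M, k):
    """Check nonnegative integer entries, M_00 = 1, and that M commutes with S and T."""
    S, T = kac_peterson(k)
    entries_ok = all(isinstance(x, int) and x >= 0 for row in M for x in row)
    return entries_ok and M[0][0] == 1 and commutes(M, S) and commutes(M, T)
```

For example `block_invariant(10, [(0, 6), (3, 7), (4, 10)])` encodes the exceptional at $k = 10$, and `d_odd_invariant(6)` encodes the $k/2$ odd series at $k = 6$. Applying `d_odd_invariant` at $k = 4$, where $k/2$ is even, gives a matrix that fails to commute with $T$.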

Around Christmas 1985, Zuber wrote Kac about the $A_1^{(1)}$ physical invariant problem,
and mentioned the physical invariants he and Itzykson knew at that point (what we now
call $\mathcal{A}_\star$ and $\mathcal{D}_{\rm even}$). A few weeks later, Kac wrote back saying he found one more invariant,
and jokingly pointed out that it must be indeed quite exceptional as the exponents of $E_6$
appeared in it. "I must confess that I didn't pay much attention to that last remark (I
hardly knew what Coxeter exponents were, at the time!)" [63]. By spring 1986, Cappelli
arrived in Paris and got things moving again; together Cappelli-Itzykson-Zuber found $\mathcal{E}_7$,
$\mathcal{D}_{\rm odd}$, and then $\mathcal{E}_8$, and struggled to find more. "And it is only in August [1986], during
a conversation with Pasquier, in which he was showing me his construction of lattice
models based on Dynkin diagrams, that I suddenly remembered this cryptic but crucial
observation of Victor, rushed to the library to find a list of the exponents of the other
algebras... and found with the delight that you can imagine that they were matching our
list" [63]. Thus the A-D-E pattern to these physical invariants was discovered.



1.6. The A-D-E meta-pattern 

Before we discuss meta-patterns in math, let's introduce the notion of lattice$^{11}$, a
simple geometric structure we'll keep returning to in these notes. The standard reference
for lattice theory is [13].

Consider the real vector space $\mathbb{R}^{m,n}$: its vectors look like $x = (x_+; x_-)$ where $x_+$
and $x_-$ are $m$- and $n$-component vectors respectively, and dot products are given by
$x \cdot y = x_+ \cdot y_+ - x_- \cdot y_-$. The dot products $x_\pm \cdot y_\pm$ are given by the usual product and sum
of components. For example, the familiar Euclidean (positive-definite) space is $\mathbb{R}^n = \mathbb{R}^{n,0}$,
while Minkowski space is $\mathbb{R}^{3,1}$.



$^{11}$ There are many words in math which have several incompatible meanings. For example, there are vector
fields and number fields, and modular forms and modular representations. 'Lattice' is another of these words.
Aside from the geometric meaning we will use, it also refers to a certain kind of 'partially ordered set'.

Now choose any basis $\mathcal{B} = \{x_1, \ldots, x_{m+n}\}$ in $\mathbb{R}^{m,n}$. So $\mathbb{R}^{m,n} = \mathbb{R}x_1 + \cdots + \mathbb{R}x_{m+n}$.
Define the set $\Lambda(\mathcal{B}) := \mathbb{Z}x_1 + \cdots + \mathbb{Z}x_{m+n}$. This is a lattice, and all lattices can be formed
in this way$^{12}$. So a lattice is discrete and is closed under sums and integer multiples. For
example, $\mathbb{Z}^{m,n}$ is a lattice (take the standard basis in $\mathbb{R}^{m,n}$). A more interesting lattice
is the hexagonal lattice (also called $A_2$), given by the basis $\mathcal{B} = \{(\frac{\sqrt{2}}{2}, \frac{\sqrt{6}}{2}), (\sqrt{2}, 0)\}$ of $\mathbb{R}^2$
(try to plot several points). If you wanted to slide a bunch of coins on a table together
as tightly as possible, their centres would form this hexagonal lattice. Another important
lattice is $II_{1,1} \subset \mathbb{R}^{1,1}$, given by $\mathcal{B} = \{(\frac{1}{\sqrt{2}}; \frac{1}{\sqrt{2}}), (\frac{1}{\sqrt{2}}; -\frac{1}{\sqrt{2}})\}$; equivalently it can be thought
of as the set of all pairs $(a, b) \in \mathbb{Z}^2$ with dot product

$$(a, b) \cdot (c, d) = ad + bc\,. \qquad (1.6.1)$$

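It is important to note that different choices of basis may or may not result in a
different lattice. For a trivial example, consider $\mathcal{B} = \{1\}$ and $\mathcal{B}' = \{-1\}$ in $\mathbb{R} = \mathbb{R}^{1,0}$: they
both give the lattice $\mathbb{Z} = \mathbb{Z}^{1,0}$. Two lattices are called equivalent if they only differ by a
change-of-basis of the ambient space which preserves dot products.
E.g. $\mathcal{B} = \{(\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}), (\frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}})\}$ in $\mathbb{R}^2$ yields a lattice equivalent to $\mathbb{Z}^2$.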

The dimension of the lattice is $m + n$. The lattice is called positive-definite if it lies
in some $\mathbb{R}^m$ (i.e. $n = 0$). The lattice is called integral if all dot products $x\cdot y$ are integers,
for $x, y \in \Lambda$. A lattice $\Lambda$ is called even if it is integral and in addition all norms $x\cdot x$ are
even integers. For example, $\mathbb{Z}^{m,n}$ is integral but not even, while $A_2$ and $II_{1,1}$ are even.
The dual $\Lambda^*$ of a lattice $\Lambda$ consists of all vectors $x \in \mathbb{R}^{m,n}$ such that $x\cdot\Lambda \subset \mathbb{Z}$. So a lattice
is integral iff $\Lambda \subset \Lambda^*$. A lattice is called self-dual if $\Lambda = \Lambda^*$. $\mathbb{Z}^{m,n}$ and $II_{1,1}$ are self-dual
but $A_2$ is not.
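These definitions can be checked mechanically once a lattice is described by the Gram matrix $G_{ij} = x_i\cdot x_j$ of a basis. The following Python sketch is an illustration only and is not part of the original notes: the names `det`, `classify` and `GRAMS` are hypothetical, the Gram matrices were computed by hand from the bases quoted above, and the self-duality test uses the standard fact that an integral lattice is self-dual exactly when its Gram determinant is $\pm1$.

```python
from fractions import Fraction

def det(matrix):
    """Exact determinant via Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result          # a row swap flips the sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result

def classify(gram):
    """Report integrality, evenness and self-duality from a Gram matrix."""
    integral = all(Fraction(x).denominator == 1 for row in gram for x in row)
    # For an integral lattice, even basis norms force all norms to be even.
    even = integral and all(gram[i][i] % 2 == 0 for i in range(len(gram)))
    # Assumption: integral lattice is self-dual iff |det(Gram)| = 1.
    self_dual = integral and abs(det(gram)) == 1
    return {"integral": integral, "even": even, "self_dual": self_dual}

# Gram matrices worked out by hand from the bases in the text.
GRAMS = {
    "Z^{2,1}": [[1, 0, 0], [0, 1, 0], [0, 0, -1]],
    "A_2": [[2, 1], [1, 2]],
    "II_{1,1}": [[0, 1], [1, 0]],
}

for name, gram in GRAMS.items():
    print(name, classify(gram))
```

Working with Gram matrices rather than the irrational basis coordinates keeps all arithmetic exact.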

There are lots of 'meta-patterns' in math, i.e. collections of seemingly different problems
which have similar answers. Once one of these meta-patterns is identified it is always
helpful to understand what is responsible for it. For example, while I was writing up
my PhD thesis I noticed in several places the numbers 1, 2, 3, 4, and 6. For instance
$\cos(2\pi r) \in \mathbb{Q}$ for $r \in \mathbb{Q}$ iff the denominator of $r$ is 1, 2, 3, 4, or 6. This pattern was easy
to explain: they are precisely those positive integers $n$ with Euler totient $\phi(n) \leq 2$, i.e.
there are at most 2 positive numbers less than $n$ coprime$^{13}$ to $n$. The other incidences of
these numbers can usually be reduced to this $\phi(n) \leq 2$ property (e.g. the dimension of the
number field $\mathbb{Q}[\cos(2\pi\frac{a}{b})]$ (see §1.8) considered as a vector space over $\mathbb{Q}$ will be $\phi(b)/2$).
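The totient claim is easy to test by brute force. The snippet below is a small sketch, with a hypothetical helper `totient` that counts the integers in $1,\ldots,n$ coprime to $n$ (so $\phi(1) = 1$ by the usual convention); the search bound 500 is an arbitrary choice, not a proof.

```python
from math import gcd

def totient(n):
    """Euler totient: how many x in 1..n are coprime to n."""
    return sum(1 for x in range(1, n + 1) if gcd(x, n) == 1)

# Integers below an arbitrary search bound whose totient is at most 2.
small_totient = [n for n in range(1, 500) if totient(n) <= 2]
print(small_totient)
```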

$^{12}$ In most presentations a lattice is permitted to have smaller dimension than its ambient space; however, that
freedom gains no real generality.

$^{13}$ We say $m, n$ are coprime if any prime $p$ which divides $m$ does not divide $n$, and vice versa.

A more interesting meta-pattern involves the number 24 and its divisors. One sees 24
wherever modular forms naturally appear. For instance, we see it in the critical dimensions
in string theory: 24 + 2 and 8 + 2. Another example: the dimensions of even self-dual
positive-definite lattices must be a multiple of 8 (e.g. the $E_8$ root lattice defined shortly has
dimension 8, while the Leech lattice discussed in §2.4 has dimension 24). The meta-pattern
24 is also understood: the fundamental problem for which it is the answer is the following
one. Fix $n$, and consider the congruence $x^2 \equiv 1 \pmod n$. Certainly in order to have a
chance of satisfying this, $x$ and $n$ must be coprime. The extreme situation is when every
number $x$ coprime to $n$ satisfies this congruence:

$$\gcd(x,n) = 1 \iff x^2 \equiv 1 \pmod n . \tag{1.6.2}$$

The reader can try to verify the following simple fact: n obeys this extreme situation 
(1.6.2) iff n divides 24. 
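For readers who prefer to see the fact before proving it, here is a brute-force sketch. The function name `has_extreme_property` and the search bound are illustrative assumptions; the check only covers the nontrivial direction of (1.6.2), since $x^2 \equiv 1 \pmod n$ always forces $\gcd(x,n) = 1$.

```python
from math import gcd

def has_extreme_property(n):
    """True if every x coprime to n satisfies x^2 = 1 (mod n)."""
    return all(x * x % n == 1 % n
               for x in range(1, n + 1) if gcd(x, n) == 1)

extreme = [n for n in range(1, 500) if has_extreme_property(n)]
divisors_of_24 = [n for n in range(1, 25) if 24 % n == 0]
print(extreme)
print(divisors_of_24)
```

The comparison `1 % n` rather than `1` handles the degenerate case $n = 1$, where every residue is 0.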

What does this congruence property have to do with these other occurrences of 24?
Let $\Lambda$ be an even self-dual positive-definite lattice of dimension $n$. Then an elementary
argument shows that there will exist an $n$-tuple $a = (a_1, \ldots, a_n)$ of odd integers with the
property that 8 must divide $a\cdot a = \sum_i a_i^2$. But $a_i^2 \equiv 1 \pmod 8$, and so we get $8 | n$.

A much deeper and still not-completely-understood meta-pattern is called A-D-E (see
[1] for a discussion and examples). The name comes from the so-called simply-laced algebras,
i.e. the simple finite-dimensional Lie algebras whose Dynkin diagrams (see Figure
6 in [59]) contain only single edges (i.e. no arrows). These are the $A_\ell$- and $D_\ell$-series,
along with the $E_6$, $E_7$ and $E_8$ exceptionals. The claim is that many other problems, which
don't seem to have anything directly in common with simple Lie algebras, have a solution
which falls into this A-D-E pattern (for an object to be meaningfully labelled $X_\ell$, some of
the data associated to the algebra $X_\ell$ should reappear in some form in that object). Let's
look at some examples.

Consider even positive-definite lattices $\Lambda$. The smallest possible nonzero norm in $\Lambda$
will be 2, and the vectors of norm 2 are special and are called roots. The reason they are
special is that reflecting through them will always be an automorphism of $\Lambda$. That is, the
reflection $u \mapsto u - 2\frac{u\cdot a}{a\cdot a}\,a$ through $a \neq 0$ won't in general map $\Lambda$ to itself, unless $a$ is a root
of $\Lambda$. It is important in lattice theory to know the lattices which are spanned by their roots;
it turns out these are precisely the orthogonal direct sums of lattices called $A_n$, $D_n$, and
$E_6$, $E_7$ and $E_8$. They carry those names for a number of reasons. For example, the lattice
called $X_n$ will have a basis $\{a_1, \ldots, a_n\}$ with the property that the matrix $A_{ij} := a_i\cdot a_j$ is
the Cartan matrix (see §2.7) for the Lie algebra $X_n$. Also, the reflection group generated
by reflections in the roots of the lattice $X_n$ will be isomorphic to the Weyl group of the
Lie algebra $X_n$. Finally, to any simple Lie algebra there is canonically associated a lattice
called the root lattice; for the simply-laced algebras, these will equal the corresponding
lattice of the same name. Incidentally, the root lattices for the non-simply-laced simple
algebras will (up to rescalings) be direct sums of the simply-laced root lattices.

We have already met the $A_2$ lattice: it is the densest packing of circles in the plane. It
has long been believed that the obvious pyramidal way to pack oranges is also the densest
possible way (the centres of the oranges form the $A_3$ root lattice). A controversial proof
of this famous conjecture was offered by W.-Y. Hsiang in 1991; in 1998 a new proof
was proposed by Hales et al. The densest known packings in dimensions 4, 5, 6, 7, 8 are
$D_4$, $D_5$, $E_6$, $E_7$, $E_8$, resp. $E_8$ is the smallest even self-dual positive-definite lattice.

A famous A-D-E example is called the McKay$^{14}$ correspondence. Consider any finite
subgroup $G$ of the Lie group $SU_2(\mathbb{C})$ (i.e. the $2\times 2$ unitary matrices with determinant 1).
For example, there is the cyclic group $\mathbb{Z}_n$ of $n$ elements generated by the matrix

$$M_n = \begin{pmatrix} \exp[2\pi i/n] & 0 \\ 0 & \exp[-2\pi i/n] \end{pmatrix} .$$

$^{14}$ He is the same John McKay we will celebrate in section 2.1.

Let $R_i$ be the irreducible representations of $G$. For instance, for $\mathbb{Z}_n$, there are precisely
$n$ of these, all 1-dimensional, given by sending the generator $M_n$ to $\exp[2k\pi i/n]$ for each
$k = 1, 2, \ldots, n$. Now consider the tensor product $G \otimes R_i$, where we interpret $G \subset SU_2(\mathbb{C})$
here as a 2-dimensional representation. We can decompose that product into a direct sum
$\oplus_j m_{ij} R_j$ of irreducibles (the $m_{ij}$ here are multiplicities). Now create a graph with one
node for each $R_i$, and with the $i$th and $j$th nodes ($i \neq j$) connected with precisely $m_{ij}$
directed edges $i \to j$. If $m_{ij} = m_{ji}$, we agree to erase the double arrows from the $m_{ij}$
edges. Then McKay observed that the graph of any $G$ will be a distinct extended Dynkin
diagram of A-D-E type! For instance, the cyclic group with $n$ elements corresponds to the
extended graph of $A_{n-1}$.
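For the cyclic case the McKay graph can be computed directly from characters. The sketch below is illustrative and makes its own conventions explicit: the irreducible representations are labelled $k = 0, \ldots, n-1$ (so the text's $k = n$ is $k = 0$ here), the function name `mckay_multiplicities` is hypothetical, and the multiplicity $m_{ij}$ is obtained from the usual character inner product, rounded from floating point.

```python
import cmath

def mckay_multiplicities(n):
    """m[i][j] = multiplicity of R_j in G (x) R_i for the cyclic Z_n in SU_2(C)."""
    omega = cmath.exp(2j * cmath.pi / n)
    m = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Character inner product: average over the group elements M_n^t.
            total = sum(
                (omega ** t + omega ** (-t))        # trace of M_n^t (the 2-dim rep G)
                * omega ** (i * t)                  # character of R_i
                * (omega ** (j * t)).conjugate()    # conjugate character of R_j
                for t in range(n)
            )
            m[i][j] = round((total / n).real)
    return m

for row in mckay_multiplicities(5):
    print(row)
```

For $n = 5$ each node is joined to its two cyclic neighbours, which is the circle of $n$ nodes forming the extended $A_{n-1}$ diagram; for $n = 2$ the two nodes are joined by a double edge.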

How was McKay led to his remarkable correspondence? He knew that the sum of
the 'marks' $a_i = 1, 2, 3, 4, 5, 6, 4, 2, 3$ associated to each node of the extended $E_8$ Dynkin
diagram equaled 30, the Coxeter number of $E_8$. So what did their squares add to? 120,
which he recognised as the cardinality of one of the exceptional finite subgroups of $SU_2(\mathbb{C})$,
and that got him thinking...

Another famous example of A-D-E, due to Arnol'd, is the 'simple critical points' of
smooth complex-valued functions $f$, on e.g. $\mathbb{C}^3$. For example, both $x^2 + y^2 + z^{n+1}$ and
$x^2 + y^3 + z^5$ have singularities at $(0, 0, 0)$ (i.e. their first partial derivatives all vanish there),
and they are assigned to $A_n$ and $E_8$, respectively. The $SU_2(\mathbb{C})$ subgroups can be related
to singularities as follows. The group $SU_2(\mathbb{C})$ acts on $\mathbb{C}^2$ in the obvious way (matrix
multiplication). If $G$ is a discrete subgroup, then consider the (ring of) polynomials in
2 variables $w_1, w_2$ invariant under $G$. It turns out it will have 3 generators $x(w_1, w_2)$,
$y(w_1, w_2)$, $z(w_1, w_2)$, which are connected by 1 polynomial relation (syzygy). For instance,
take $G$ to be the cyclic group $\mathbb{Z}_n$; then we're interested in polynomials $p(w_1, w_2)$ invariant
under $w_1 \mapsto \exp[2\pi i/n]\,w_1$, $w_2 \mapsto \exp[-2\pi i/n]\,w_2$. Any such invariant $p(w_1, w_2)$ is clearly
generated by (i.e. can be written as a polynomial in) $w_1 w_2$, $w_1^n$ and $w_2^n$. Choosing instead
the generators $x = \frac{w_1^n - w_2^n}{2}$, $y = i\,\frac{w_1^n + w_2^n}{2}$, $z = w_1 w_2$, we get the syzygy $z^n = -(x^2 + y^2)$.
For any $G$, generators $x, y, z$ can always be found so that the syzygy will be one of the
polynomials associated to a simple singularity, and in fact will give the equation of the
algebraic surface $\mathbb{C}^2/G$ as a 2-dimensional complex surface in $\mathbb{C}^3$ (e.g. the complex surfaces
$\mathbb{C}^2/\mathbb{Z}_n$ and $\{(x, y, z) \in \mathbb{C}^3 \mid x^2 + y^2 + z^n = 0\}$ are equivalent).
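The cyclic example can be sanity-checked numerically. This sketch assumes the generators $x = (w_1^n - w_2^n)/2$, $y = i(w_1^n + w_2^n)/2$, $z = w_1 w_2$, for which $x^2 + y^2 + z^n$ vanishes identically; the function names and the sample points are arbitrary illustrative choices, and a floating-point check is of course no substitute for the one-line algebraic verification.

```python
import cmath

def invariants(w1, w2, n):
    """Evaluate the assumed generators x, y, z at the point (w1, w2)."""
    x = (w1 ** n - w2 ** n) / 2
    y = 1j * (w1 ** n + w2 ** n) / 2
    z = w1 * w2
    return x, y, z

def act(w1, w2, n):
    """Action of the generator of Z_n: w1 -> omega w1, w2 -> omega^{-1} w2."""
    omega = cmath.exp(2j * cmath.pi / n)
    return omega * w1, w2 / omega

def syzygy_residual(w1, w2, n):
    """|x^2 + y^2 + z^n|, which should vanish up to rounding."""
    x, y, z = invariants(w1, w2, n)
    return abs(x * x + y * y + z ** n)

SAMPLES = [(0.3 + 0.7j, -1.1 + 0.2j), (1.5 - 0.4j, 0.6 + 0.9j)]
for n in (2, 3, 7):
    for w1, w2 in SAMPLES:
        v1, v2 = act(w1, w2, n)
        drift = max(abs(p - q) for p, q in
                    zip(invariants(w1, w2, n), invariants(v1, v2, n)))
        print(n, drift < 1e-9, syzygy_residual(w1, w2, n) < 1e-9)
```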

Arguably the first A-D-E classification goes back to Theaetetus, around 400 B.C. He
classified the regular solids. For instance the tetrahedron can be associated to $E_6$ while the
cube is matched with $E_7$. This A-D-E is only partial, as there are no regular solids assigned
to the A-series, and to get the D-series one must look at 'degenerate regular solids'.

The closest thing to an explanation of the A-D-E meta-pattern would seem to be
the notion of 'additive assignments' on graphs (which is a picturesque way of describing
the corresponding eigenvalue problem). Consider any graph $\mathcal{G}$ with undirected edges,
none of which run from a node to itself. We can also assume without loss of generality
that $\mathcal{G}$ is connected. Assign a positive number $a_i$ to each node. If this assignment has
the property that for each $i$, $2a_i = \sum a_j$ where the sum is over all nodes $j$ adjacent to $i$
(counting multiplicities of edges), then we call it 'additive'. For instance, for the graph
consisting of two nodes joined by a double edge, the assignment $a_1 = 1 = a_2$ is additive, but
the assignment $a_1 = 1, a_2 = 2$ is not.
The question is, which graphs have an additive assignment? The answer is: precisely the
extended Dynkin diagrams of A-D-E type! And their additive assignments are unique (up
to constant proportionality) and are given by the marks $a_i$ of the algebra (see e.g. the Table
on p. 54 of [38]). For example the extended $A_n$ graph consists of $n + 1$ nodes arranged in
a circle, and its marks $a_i$ all equal 1.
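The additivity condition is simple to test on a concrete diagram. The following sketch checks it for the extended $E_8$ diagram, using the marks 1, 2, 3, 4, 5, 6, 4, 2, 3 quoted earlier. The node labels, the edge list (a chain of eight nodes with the ninth node attached to the node of mark 6, which is the usual picture of extended $E_8$) and the function name `is_additive` are assumptions made for illustration.

```python
def is_additive(edges, marks):
    """Check 2*a_i == sum of a_j over neighbours j (edges listed with multiplicity)."""
    neighbour_sum = {node: 0 for node in marks}
    for u, v in edges:
        neighbour_sum[u] += marks[v]
        neighbour_sum[v] += marks[u]
    positive = all(a > 0 for a in marks.values())
    return positive and all(2 * marks[n] == neighbour_sum[n] for n in marks)

# Assumed layout of the extended E_8 diagram: chain a-b-...-h, with i attached to f.
E8_EDGES = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"),
            ("e", "f"), ("f", "g"), ("g", "h"), ("f", "i")]
E8_MARKS = dict(zip("abcdefghi", [1, 2, 3, 4, 5, 6, 4, 2, 3]))

# Two nodes joined by a double edge (the extended A_1 diagram).
DOUBLE_EDGE = [("x", "y"), ("x", "y")]

print(is_additive(E8_EDGES, E8_MARKS))
print(sum(E8_MARKS.values()), sum(a * a for a in E8_MARKS.values()))
print(is_additive(DOUBLE_EDGE, {"x": 1, "y": 1}),
      is_additive(DOUBLE_EDGE, {"x": 1, "y": 2}))
```

The same marks reproduce McKay's two numbers: they sum to 30 and their squares sum to 120.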

What do additive assignments have to do with the other A-D-E classifications? Consider
a finite subgroup $G$ of $SU_2(\mathbb{C})$. Take the dimension of the equation $G \otimes R_i = \oplus_j m_{ij} R_j$:
we get $2d_i = \sum_j m_{ij} d_j$ where $d_j = \dim(R_j)$. Hence the dimensions of the irreducible
representations define an additive assignment for each of McKay's graphs, and hence those
graphs must be of A-D-E type (provided we know $m_{ij} = m_{ji}$).

As Cappelli-Itzykson-Zuber observed, the physical invariants for $A_1^{(1)}$ also realise the
A-D-E pattern, in the following sense. The Coxeter number $h$ of the name $X_\ell$ (i.e. the sum
$\sum_i a_i$ of the marks) equals $k + 2$, and the exponents $m_i$ of $X_\ell$ equal those $a \in P_+$ for which
$M_{aa} \neq 0$ (for the simply-laced algebras, the $m_i$ are defined by writing the eigenvalues
of the corresponding Cartan matrix (see §2.7) as $4\sin^2(\frac{\pi m_i}{2h})$; the $m_i$ are integers and
the smallest is always 1). Probably what first led Kac to his observation about the $E_6$
exponents was that $k + 2$ (this is how $k$ enters most formulas) for his exceptional equalled
the Coxeter number 12 for $E_6$. More recently, the operator algebraists Ocneanu [48] and
independently Böckenhauer-Evans [4] found an A-D-E interpretation for the off-diagonal
entries $M_{ab}$ of the $A_1^{(1)}$ physical invariants, using subfactor theory.

We are not claiming that this $A_1^{(1)}$ classification is 'equivalent' to any other A-D-E
one: that would miss the point of meta-patterns. What we really want to do is to
identify some critical combinatorial part of an $A_1^{(1)}$ proof with critical parts in other
A-D-E classifications, which is what we did with the other meta-patterns. A considerably
simplified proof of the $A_1^{(1)}$ classification is now available [29], so hopefully this task will
now be easier.

There has been some progress at understanding this $A_1^{(1)}$ A-D-E. Nahm [46] constructed
the invariant $X_\ell$ in terms of the compact simply-connected Lie group of type $X_\ell$,
and in this way could interpret the $k + 2 = h$ and $M_{m_i m_i} \neq 0$ coincidences. A very general
explanation for A-D-E has been suggested by Ocneanu [48] using his theory of path algebras
on graphs; although his work has never been published, others are now rediscovering
(and publishing!) similar work (see e.g. [4]). Nevertheless, the A-D-E in CFT remains
almost as mysterious now as it did a dozen years ago: for example it still isn't clear how
it directly relates to additive assignments.

There are 4 other claims for A-D-E classifications of families of RCFT physical invariants,
and all of them inherit their (approximate) A-D-E pattern from the more fundamental
$A_1^{(1)}$ one. One is the $c < 1$ minimal models, also proven in [8], and another is the $N = 1$
superconformal minimal models, proved by Cappelli (1987). In both cases the physical
invariants are parametrised by pairs of A-D-E diagrams. The list of known $c = 1$ RCFTs
also looks like A-D-E (two series parametrised by $\mathbb{Q}_+$, and three exceptionals), but the
completeness of that list has never been rigorously established.

The fourth classification often quoted as A-D-E is the $N = 2$ superconformal minimal
models. Their classification was done by Gannon (1997). The connection here with A-D-E
turns out to be rather weak: e.g. 20, 30, and 24 distinct invariants would have an equal
right to be called $\mathcal{E}_6$, $\mathcal{E}_7$, and $\mathcal{E}_8$ respectively. It would appear that the frequent claims
that the $N = 2$ minimal models fall into an A-D-E pattern are rather dubious.

Hanany-He [35] suggest that the $A_1^{(1)}$ A-D-E pattern can be related to subgroups
$G \subset SU_2(\mathbb{C})$ by orbifolding 4-dimensional $N = 4$ supersymmetric gauge theory by $G$,
resulting in an $N = 2$ superCFT whose 'matter matrix' can be read off from the Dynkin
diagram corresponding to $G$. The same game can be played with finite subgroups of
$SU_3(\mathbb{C})$, resulting in $N = 1$ superCFTs whose matter matrices correspond to graphs very
reminiscent of the 'fusion graphs' of Di Francesco-Petkova-Zuber (see e.g. [62]) corresponding
to $A_2^{(1)}$ physical invariants. [35] use this to conjecture a McKay-type correspondence
between singularities of type $\mathbb{C}^n/G$, for $G \subset SU_n(\mathbb{C})$, and the physical invariants of $A_{n-1}^{(1)}$.
This in their view would be the form A-D-E takes for higher rank physical invariants.
Their actual conjecture though is still somewhat too vague.

For a final example of a meta-pattern, consider 'modular function' (see §2.3). After all,
modular functions appear in a surprising variety of places and disguises. Maybe we shouldn't regard
their ubiquity as fortuitous; perhaps instead there's a deeper common 'situation' which
is the source for that ubiquity, just as 'symmetry' yields 'group', or 'rain-followed-by-heat'
breeds mosquitos. Math is not above metaphysics; like any area it grows by asking
questions, and changing your perspective (even to a metaphysical one) should suggest
new questions.



1.7. Simple-currents and charge-conjugation 

The key properties$^{15}$ of the matrix $S$ are that it's unitary and symmetric (so $M$ in
§1.5 equals $SMS^*$),

$$S_{0\mu} > 0 \quad \text{for all } \mu \in P_+ , \tag{1.7.1}$$

and that the numbers $N_{\lambda\mu}^{\nu}$ defined by Verlinde's formula (1.4.4) are nonnegative integers.
These are obeyed by the matrix $S$ in any (unitary) RCFT. From these basic properties,
we will obtain here some elementary consequences which have important applications.

But first, let's make an observation which isn't difficult to prove, but doesn't appear
to be generally known. Verlinde's formula looks strange, but it is quite generic,
and we can see it throughout math and mathematical physics. Consider the following.

Let $\mathfrak{A}$ be a commutative associative algebra, over $\mathbb{R}$ say. Suppose $\mathfrak{A}$ has a finite basis
$\Phi$ (over $\mathbb{R}$) containing the unit 1. Define the 'structure constants' $N_{ab}^c \in \mathbb{R}$, for $a, b, c \in \Phi$,
by $ab = \sum_{c\in\Phi} N_{ab}^c\, c$. Suppose there is an algebra homomorphism $*$ (so $*$ is linear, and
$(xy)^* = x^* y^*$) which permutes the basis vectors (so $\Phi^* = \Phi$), and we have the relation
$N_{ab}^1 = \delta_{b,a^*}$. We call any such algebra $\mathfrak{A}$ a fusion algebra. Then any fusion algebra will
necessarily have a unitary matrix $S$ with $S_{1a} > 0$ and with the structure constants given by
Verlinde's formula. Algebraically, the relation $S = S^t$ holds if $\mathfrak{A}$ is 'self-dual' in a certain
natural sense.

$^{15}$ A good exercise for the reader is to prove that if $S$ is unitary and symmetric, and obeys (1.7.1), then there
will be at most finitely many physical invariants $M$ for that $S, T$.

Define the 'fusion matrices' $N_\lambda$ by $(N_\lambda)_{\mu\nu} = N_{\lambda\mu}^{\nu}$. Then Verlinde's formula says that
the $\mu$th column $S_{\updownarrow\mu}$ of $S$ is an eigenvector of each fusion matrix $N_\lambda$, with eigenvalue $\frac{S_{\lambda\mu}}{S_{0\mu}}$.
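To make the eigenvector statement concrete, here is a numerical sketch for $A_1^{(1)}$ at level $k$. It assumes the standard S matrix $S_{ab} = \sqrt{2/(k+2)}\,\sin(\pi(a+1)(b+1)/(k+2))$ with $a, b = 0, \ldots, k$ standing for the Dynkin label $\lambda_1$ (I take this to be what (1.4.3) gives, but that equation is not reproduced here), and Verlinde's formula in the form $N_{ab}^c = \sum_d S_{ad} S_{bd} \overline{S_{cd}}/S_{0d}$. All function names are hypothetical.

```python
import math

def s_matrix(k):
    """Assumed standard S matrix of A_1^(1) at level k, labels a = 0..k."""
    norm = math.sqrt(2.0 / (k + 2))
    return [[norm * math.sin(math.pi * (a + 1) * (b + 1) / (k + 2))
             for b in range(k + 1)] for a in range(k + 1)]

def fusion_coefficients(S):
    """Verlinde's formula; S is real here, so no complex conjugate is needed."""
    n = len(S)
    return [[[sum(S[a][d] * S[b][d] * S[c][d] / S[0][d] for d in range(n))
              for c in range(n)] for b in range(n)] for a in range(n)]

def eigenvector_defect(S, N, a, mu):
    """Largest entry of N_a S_col(mu) - (S[a][mu]/S[0][mu]) S_col(mu)."""
    n = len(S)
    eigenvalue = S[a][mu] / S[0][mu]
    return max(abs(sum(N[a][b][c] * S[c][mu] for c in range(n))
                   - eigenvalue * S[b][mu]) for b in range(n))

k = 3
S = s_matrix(k)
N = fusion_coefficients(S)
for a in range(k + 1):
    print([[round(x) for x in row] for row in N[a]])
```

The printed matrices $N_a$ have entries 0 and 1 only, matching the familiar truncated Clebsch-Gordan rule for $A_1^{(1)}$.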

Useful Fact. If $v$ is a simultaneous eigenvector of each fusion matrix $N_\lambda$, then there exists
a constant $c \in \mathbb{C}$ and a $\lambda \in P_+$ such that $v = c\, S_{\updownarrow\lambda}$.

For one consequence, take the complex conjugate of the eigenvector equation
$N_\lambda S_{\updownarrow\mu} = \frac{S_{\lambda\mu}}{S_{0\mu}}\, S_{\updownarrow\mu}$: since $N_\lambda$ is real, we get that the vector $\overline{S_{\updownarrow\mu}}$ is a simultaneous
eigenvector of all $N_\lambda$, and hence
must equal $c\, S_{\updownarrow\gamma}$ for some number $c$ and weight $\gamma \in P_+$, both depending on $\mu$. Write
$\gamma = C\mu$; then $C$ defines a permutation of $P_+$. The reader can verify that unitarity of $S$
forces $|c| = 1$, while (1.7.1) forces $c > 0$. Thus $c = 1$ and we obtain the formula

$$\overline{S_{\lambda\mu}} = S_{\lambda,C\mu} = S_{C\lambda,\mu} . \tag{1.7.2}$$

Also, unitarity and symmetry of $S$ forces $C = S^2$, while conjugating twice shows $C^2 = id$.
$C$ is an important matrix in RCFT, and is called charge-conjugation. When $C = id.$, then
the matrix $S$ is real.

Note that (1.7.1) now implies $C0 = 0$. Also $CT = TC$. Hence $M = C$ will always
define a physical invariant, and if $M$ is any other physical invariant, the matrix product
$MC = CM$ will define another physical invariant. Also, $N_{C\lambda,C\mu}^{C\nu} = N_{\lambda\mu}^{\nu}$ and
$(N_\lambda)^t = C N_\lambda C = N_{C\lambda}$.

For the WZW (=affine) case, $C$ has a special meaning: $C\lambda$ is the highest-weight
'contragredient' to $\lambda$. $C$ corresponds to an order 2 (or 1) symmetry of the (unextended) Dynkin
diagram. For example, for $A_\ell^{(1)}$, we have $C(\lambda_0, \lambda_1, \ldots, \lambda_{\ell-1}, \lambda_\ell) = (\lambda_0, \lambda_\ell, \lambda_{\ell-1}, \ldots, \lambda_1)$.
For $A_1^{(1)}$ then, $C = id.$, which can also be read off from (1.4.3).

The algebras $D_{\rm even}^{(1)}$ all have at least one nontrivial symmetry of the (unextended)
Dynkin diagram which isn't the charge-conjugation. The most interesting example is $D_4^{(1)}$,
which has 5 of these. By a conjugation, we will mean any symmetry of the unextended
Dynkin diagram.

To go much further, we need a fascinating tool called Perron-Frobenius theory: a
collection of results concerning the eigenvalues and eigenvectors of nonnegative matrices
(i.e. matrices in which every entry is a nonnegative real number). Whenever you have
such matrices in your problem, and it is natural to multiply them, then there is a good
chance Perron-Frobenius theory will tell you something interesting. The basic result here
is that if $A$ is a nonnegative matrix, then there will be a nonnegative eigenvector $x \geq 0$
with eigenvalue $\rho \geq 0$, such that if $\lambda$ is any other eigenvalue of $A$, then $|\lambda| \leq \rho$. There are
lots of other results (see e.g. [44]), e.g. $\rho$ must be at least as large as any diagonal entry of
$A$, and there must be a row-sum of $A$ no bigger than $\rho$, and another row-sum no smaller
than $\rho$.
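A quick way to see these statements in action is power iteration. The sketch below is a numerical illustration rather than anything from Perron-Frobenius theory itself: power iteration only converges when the Perron-Frobenius eigenvalue strictly dominates the others in absolute value, which is not guaranteed for every nonnegative matrix. The matrix `EXAMPLE` and the function names are hypothetical choices made for this illustration.

```python
def perron_frobenius(matrix, iterations=200):
    """Approximate the dominant eigenvalue rho and eigenvector by power iteration."""
    n = len(matrix)
    v = [1.0] * n
    rho = 0.0
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        rho = max(w)               # v is scaled so max(v) == 1, hence max(A v) -> rho
        v = [x / rho for x in w]
    return rho, v

def satisfies_bounds(matrix, rho, tol=1e-9):
    """rho >= every diagonal entry, and rho lies between the extreme row sums."""
    row_sums = [sum(row) for row in matrix]
    diagonal = [matrix[i][i] for i in range(len(matrix))]
    return (max(diagonal) <= rho + tol
            and min(row_sums) <= rho + tol
            and rho <= max(row_sums) + tol)

# Hypothetical nonnegative matrix with eigenvalues 2, 0 and -1.
EXAMPLE = [[1, 1, 1],
           [1, 0, 0],
           [1, 0, 0]]

rho, v = perron_frobenius(EXAMPLE)
print(round(rho, 6), [round(x, 6) for x in v], satisfies_bounds(EXAMPLE, rho))
```

For this matrix the eigenvector comes out proportional to $(2, 1, 1)^t$, and the row sums 3, 1, 1 bracket $\rho = 2$ as promised.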

For instance, consider

$$A = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{pmatrix} , \qquad B = \cdots$$

Perron-Frobenius eigenvectors for $A$ and $B$ are $(1, 1, 1)^t$ and $(2, 1, 1)^t$, with eigenvalues 3 and
2 resp. The other eigenvalue of $A$ is 0 (multiplicity 2), while those of $B$ are 0 and $-1$.
Fusion matrices $N_\lambda$ are nonnegative, and it is indeed natural to multiply them:

$$N_\lambda N_\mu = \sum_{\nu \in P_+} N_{\lambda\mu}^{\nu}\, N_\nu .$$
So we can expect Perron-Frobenius to tell us something interesting. This is the case, and
we obtain the curious-looking inequalities

$$S_{\lambda 0}\, S_{0\mu} \geq |S_{\lambda\mu}|\, S_{00} . \tag{1.7.3}$$

Squaring both sides, summing over $\mu$ and using unitarity, we get that $S_{\lambda 0} \geq S_{00}$. In other
words, the ratio $\frac{S_{\lambda 0}}{S_{00}}$, called the quantum-dimension of $\lambda$, will necessarily be $\geq 1$.
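Both (1.7.3) and the bound on quantum-dimensions can be checked numerically in the simplest family. As before this is a sketch under an assumption: the $A_1^{(1)}$ level $k$ matrix $S_{ab} = \sqrt{2/(k+2)}\,\sin(\pi(a+1)(b+1)/(k+2))$ is the standard one and is presumed to agree with (1.4.3); the helper names are hypothetical.

```python
import math

def s_matrix(k):
    """Assumed standard S matrix of A_1^(1) at level k, labels a = 0..k."""
    norm = math.sqrt(2.0 / (k + 2))
    return [[norm * math.sin(math.pi * (a + 1) * (b + 1) / (k + 2))
             for b in range(k + 1)] for a in range(k + 1)]

def inequality_holds(S, tol=1e-12):
    """Check S[a][0] * S[0][b] >= |S[a][b]| * S[0][0] for all a, b."""
    n = len(S)
    return all(S[a][0] * S[0][b] >= abs(S[a][b]) * S[0][0] - tol
               for a in range(n) for b in range(n))

def quantum_dimensions(S):
    """The ratios S[a][0] / S[0][0]."""
    return [row[0] / S[0][0] for row in S]

def unit_quantum_dimension(S, tol=1e-9):
    """Labels whose quantum-dimension equals 1 (the borderline case)."""
    return [a for a, d in enumerate(quantum_dimensions(S)) if abs(d - 1) < tol]

for k in (1, 2, 5):
    S = s_matrix(k)
    print(k, inequality_holds(S),
          [round(d, 4) for d in quantum_dimensions(S)],
          unit_quantum_dimension(S))
```

In each case only the labels $a = 0$ and $a = k$ reach the lower bound 1.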

The term 'quantum-dimension' comes from quantum groups, where $\frac{S_{\lambda 0}}{S_{00}}$ is the
quantum-dimension of the module labelled by $\lambda$ of the quantum group $U_q(X_\ell)$.

The borderline case then$^{16}$ is when a quantum-dimension equals 1. Any such weight
is called a simple-current. The theory of simple-currents was developed most extensively
by Schellekens and collaborators (see e.g. [50]). The simple-currents for the affine algebras
were classified by J. Fuchs (1991), and the result is that (with one unimportant exception:
$E_8^{(1)}$ at level 2) they all correspond to symmetries of the extended Dynkin diagrams. In
particular, applying any such symmetry to the vacuum $0 = (k, 0, \ldots, 0)$ gives the list of
simple-currents. For instance, the $\ell + 1$ weights of the form $(0, \ldots, k, \ldots, 0)$ ($k$ in the $i$th
spot) are the simple-currents for $A_\ell^{(1)}$. There are 2 simple-currents for $B_\ell^{(1)}$, $C_\ell^{(1)}$ and $E_7^{(1)}$,
3 for $E_6^{(1)}$, and 4 for $D_\ell^{(1)}$. Simple-currents play a large role in RCFT, as we shall see.

$^{16}$ This seems to be a standard trick in math: when some sort of bound is established, look at the extremal
cases which realise that bound. If your bound is a good one, it should be possible to say something about those
extremal cases, and having something to say is always of paramount importance. This trick is used for instance
in the definition of 24 last section, and the definition of normal subgroup in section 2.2.

Let $j$ be any simple-current. Then (1.7.3) becomes $S_{0\mu} \geq |S_{j\mu}|$ for all $\mu$, so unitarity
forces $S_{0\mu} = |S_{j\mu}|$, that is

$$S_{j\mu} = \exp[2\pi i\, Q_j(\mu)]\, S_{0\mu} \qquad \forall \mu \in P_+ \tag{1.7.4}$$

for some rational numbers $0 \leq Q_j(\mu) < 1$. Hence by diagonalising, we get $N_j N_{Cj} = I$.
But the inverse of a nonnegative matrix $A$ is itself nonnegative, only if $A$ is a 'generalised
permutation matrix', i.e. a permutation matrix except the 1's can be replaced by any positive
numbers. But $N_j$ and $N_{Cj}$ are also integral, and so they must in fact be permutation
matrices. Write $(N_j)_{\lambda\mu} = \delta_{\mu,J\lambda}$ for some permutation $J$ of $P_+$. So $j = J0$. Then
Verlinde's formula and (1.7.4) give

$$1 = N_{j\lambda}^{J\lambda} = \sum_{\nu \in P_+} \exp[2\pi i\, Q_j(\nu)]\, S_{\lambda\nu}\, \overline{S_{J\lambda,\nu}} ,$$

so taking absolute values and using the triangle inequality and unitarity of $S$, we find that
(1.7.4) generalises:

$$S_{J\lambda,\mu} = \exp[2\pi i\, Q_j(\mu)]\, S_{\lambda\mu} . \tag{1.7.5}$$
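Relation (1.7.5) can be verified directly for the nontrivial simple-current of $A_1^{(1)}$ at level $k$, where in terms of the label $a = \lambda_1$ the permutation is $Ja = k - a$ and the charge is $Q_j(b) = b/2$. The sketch assumes the standard S matrix $S_{ab} = \sqrt{2/(k+2)}\,\sin(\pi(a+1)(b+1)/(k+2))$; the function names are hypothetical.

```python
import cmath
import math

def s_matrix(k):
    """Assumed standard S matrix of A_1^(1) at level k, labels a = 0..k."""
    norm = math.sqrt(2.0 / (k + 2))
    return [[norm * math.sin(math.pi * (a + 1) * (b + 1) / (k + 2))
             for b in range(k + 1)] for a in range(k + 1)]

def simple_current_defect(k):
    """Max of |S[J a][b] - exp(2 pi i Q_j(b)) S[a][b]| with J a = k - a, Q_j(b) = b/2."""
    S = s_matrix(k)
    worst = 0.0
    for a in range(k + 1):
        for b in range(k + 1):
            phase = cmath.exp(2j * cmath.pi * (b / 2))
            worst = max(worst, abs(S[k - a][b] - phase * S[a][b]))
    return worst

for k in (1, 4, 9):
    print(k, simple_current_defect(k) < 1e-12)
```

Setting $a = 0$ recovers (1.7.4), since $J0 = k$ is the simple-current itself.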



The simple-currents form a finite abelian group, corresponding to the composition of
the permutations $J$. For any simple-currents $J, J'$, we get the symmetry
$N_{J\lambda,J'\mu}^{JJ'\nu} = N_{\lambda\mu}^{\nu}$.

The $\mathbb{Q}/\mathbb{Z}$-valued functions $Q_j$ define gradings on the fusion rings, and conversely any
grading corresponds to a simple-current in this way.

For example, the simple-current $j = (0, k)$ of $A_1^{(1)}$ at level $k$ corresponds to $Q_j(\lambda) = \lambda_1/2$
and the permutation $J\lambda = (\lambda_1, \lambda_0)$. We can see this directly from (1.4.3). For $A_2^{(1)}$
level $k$, there are 2 nontrivial simple-currents, $(0, k, 0)$ and $(0, 0, k)$. The first of these
corresponds to triality $Q(\lambda) = (\lambda_1 + 2\lambda_2)/3$ and $\lambda \mapsto (\lambda_2, \lambda_0, \lambda_1)$, while the second to
$(2\lambda_1 + \lambda_2)/3$ and $\lambda \mapsto (\lambda_1, \lambda_2, \lambda_0)$. Similar statements hold for all affine algebras: e.g. for
$B_\ell^{(1)}$ level $k$, the nontrivial simple-current has $Q_j(\lambda) = \lambda_\ell/2$ and $J\lambda = (\lambda_1, \lambda_0, \lambda_2, \ldots, \lambda_\ell)$.

One of the applications of simple-currents is that physical invariants can be built from
them in generic ways. These physical invariants all obey the selection rule

$$M_{\lambda\mu} \neq 0 \implies \mu = J\lambda \ \text{ for some simple-current } J = J(\lambda, \mu) . \tag{1.7.6}$$

We will call any such physical invariant $M$ a simple-current invariant. A special case is
the $\mathcal{D}_{\frac{k}{2}+2}$ physical invariant for $A_1^{(1)}$ at even level $k$. Up to a fairly mild assumption,
all simple-current invariants have been classified for any RCFT by Schellekens and
collaborators; given that assumption, they can all be constructed by generic methods. The
basic construction is due to Bernard [3], though it has been generalised by others. In the
WZW case, all simple-current invariants (except some for $D_\ell^{(1)}$) correspond to strings on
nonsimply-connected Lie groups.

By a generic physical invariant of $X_\ell^{(1)}$ we mean one of the form $M = C' M' C''$ where
$C', C''$ are (charge-)conjugations, and $M'$ is a simple-current invariant. In other words,
$M$ is constructed in generic ways from symmetries of the extended Dynkin diagram of $X_\ell^{(1)}$.
Any other $M$ are called exceptional.

All known results point to the validity of the following guess:

Conjecture. Choose a simple algebra $X_\ell$. Then for all sufficiently large $k$, all physical
invariants of $X_\ell^{(1)}$ at level $k$ will be generic.

In other words, any given $X_\ell^{(1)}$ will have only finitely many exceptionals. For instance,
for $A_1^{(1)}$ and $A_2^{(1)}$ at any $k > 28$ and $k > 21$ resp., all physical invariants are generic. For
$C_2^{(1)}$ and $G_2^{(1)}$, $k > 12$ and $k > 4$ resp. should work.

The richest source of exceptionals are conformal embeddings. In some cases the affine
representations $L_\lambda$ for some algebra $X_\ell^{(1)}$ (necessarily at level 1) can be decomposed into
finite direct sums of representations of some affine subalgebra $Y_m^{(1)}$ (at some level $k$). In
this case, a physical invariant for $X_\ell^{(1)}$ level 1 will yield a physical invariant for $Y_m^{(1)}$ level
$k$, obtained by replacing every $X_\ell^{(1)}$ level 1 character $\chi_\lambda$ by the appropriate finite sum of
$Y_m^{(1)}$ level $k$ characters. An example will demonstrate this simple idea: $A_1^{(1)}$ level 28 is a
conformal subalgebra of $G_2^{(1)}$ level 1, and we have the character decompositions

$$\begin{aligned}
\chi_{(1,0,0)} &= \chi_{(28,0)} + \chi_{(18,10)} + \chi_{(10,18)} + \chi_{(0,28)} \\
\chi_{(0,0,1)} &= \chi_{(22,6)} + \chi_{(16,12)} + \chi_{(12,16)} + \chi_{(6,22)} .
\end{aligned}$$

Thus the unique level 1 $G_2^{(1)}$ physical invariant $|\chi_{(1,0,0)}|^2 + |\chi_{(0,0,1)}|^2$ yields what we call the $\mathcal{E}_8$
physical invariant of $A_1^{(1)}$. All level 1 physical invariants are known, as are all conformal
embeddings and the corresponding character decompositions (branching rules).
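The matrix $M$ obtained from these two decompositions can be tested for modular invariance numerically. The sketch below labels $A_1^{(1)}$ level 28 weights $(\lambda_0, \lambda_1)$ by $a = \lambda_1$, so the two blocks are $\{0, 10, 18, 28\}$ and $\{6, 12, 16, 22\}$. It assumes the standard S matrix $S_{ab} = \sqrt{2/(k+2)}\,\sin(\pi(a+1)(b+1)/(k+2))$ and the standard conformal weights $h_a = a(a+2)/(4(k+2))$; commuting with $T$ then amounts to $h_a - h_b \in \mathbb{Z}$ whenever $M_{ab} \neq 0$. All names are hypothetical.

```python
import math
from fractions import Fraction

K = 28
BLOCKS = [(0, 10, 18, 28), (6, 12, 16, 22)]   # lambda_1 labels read off the decompositions

def s_matrix(k):
    """Assumed standard S matrix of A_1^(1) at level k, labels a = 0..k."""
    norm = math.sqrt(2.0 / (k + 2))
    return [[norm * math.sin(math.pi * (a + 1) * (b + 1) / (k + 2))
             for b in range(k + 1)] for a in range(k + 1)]

def e8_invariant():
    """M[a][b] = 1 when a and b lie in the same block, else 0."""
    M = [[0] * (K + 1) for _ in range(K + 1)]
    for block in BLOCKS:
        for a in block:
            for b in block:
                M[a][b] = 1
    return M

def conformal_weight(a, k=K):
    """Assumed standard formula h_a = a(a+2) / (4(k+2))."""
    return Fraction(a * (a + 2), 4 * (k + 2))

def max_commutator(S, M):
    """Largest entry of SM - MS in absolute value."""
    n = len(S)
    SM = [[sum(S[i][l] * M[l][j] for l in range(n)) for j in range(n)] for i in range(n)]
    MS = [[sum(M[i][l] * S[l][j] for l in range(n)) for j in range(n)] for i in range(n)]
    return max(abs(SM[i][j] - MS[i][j]) for i in range(n) for j in range(n))

M = e8_invariant()
print(max_commutator(s_matrix(K), M) < 1e-9)
print([[str(conformal_weight(a)) for a in block] for block in BLOCKS])
```

Within each block the conformal weights differ by integers, which is the $T$-invariance half of the statement.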



1.8. Galois Theory 

Évariste Galois was a brilliantly original French mathematician. Born shortly before
Napoleon's ill-fated invasion of Russia, he died shortly before the ill-fated 1832 uprising in
Paris. His last words: "Don't cry, I need all my courage to die at 20".

Galois grew up in a time and place confused and excited by revolution. He was known 
to say "if I were only sure that a body would be enough to incite the people to revolt, 
I would offer mine". On May 2 1832, after frustration over failure in love and failure to 
convince the Paris math establishment of the depth of his ideas, he made his decision. A 
duel was arranged with a friend, but only his friend's gun would be loaded. Galois died the 
day after a bullet perforated his intestine. At his funeral it was discovered that a famous 
general had also just died, and the revolutionaries decided to use the general's death rather 
than Galois' as a pretext for an armed uprising. A few days later the streets of Paris were 
blocked by barricades, but not because of Galois' sacrifice: his death had been pointless 
[56]. 

Galois theory in its most general form is the study of relations between objects defined 
implicitly by some conditions. For example, the objects could be the solutions to a given 
differential equation. In the incarnation of Galois we are interested in here, the objects are 
numbers, namely the zeros of certain polynomials. We will sketch this theory below, but 
see e.g. the article by Stark in [60] for more details. 

Gauss seems to have been the first to show that 'weird' (complex) numbers could tell
us about the integers. For instance, suppose we are interested in the equation $n = a^2 + b^2$.
Consider $5 = 2^2 + 1^2$. We can write this as $5 = (2 + i)(2 - i)$, so we are led to consider
complex numbers of the form $a + bi$, for $a, b \in \mathbb{Z}$. These are now called 'Gaussian integers'.
Suppose we know the following theorem:

Fact. Let $p \in \mathbb{Z}$ be any prime number. Then $p$ factorises over the Gaussian integers iff
$p = 2$ or $p \equiv 1 \pmod 4$.

By 'factorise' there, we mean $p = zw$ where neither $z$ nor $w$ is a 'unit': $\pm 1, \pm i$.

Now suppose $p$ is a prime, $= 2$ or $\equiv 1 \pmod 4$, and we write $p = (a + bi)(c + di)$.
Then $p^2 = (a^2 + b^2)(c^2 + d^2)$, so $a^2 + b^2 = c^2 + d^2 = p$. Conversely, suppose $p = a^2 + b^2$,
then $p = (a + bi)(a - bi)$. Thus:

Consequence.$^{17}$ Let $p \in \mathbb{Z}$ be any prime number. Then $p = a^2 + b^2$ for $a, b \in \mathbb{Z}$ iff $p = 2$
or $p \equiv 1 \pmod 4$.

Now we can answer the question: can a given $n$ be written as a sum of 2 squares
$n = a^2 + b^2$? Write out the prime decomposition $n = \prod_p p^{a_p}$. Then $n = a^2 + b^2$ has a
solution iff $a_p$ is even for every $p \equiv 3 \pmod 4$. For instance $60 = 2^2 \cdot 3^1 \cdot 5^1$ cannot be
written as the sum of 2 squares, but $90 = 2^1 \cdot 3^2 \cdot 5^1$ can. We can also find (and count) all
solutions: e.g. $90 = 2 \cdot 3^2 \cdot 5 = \{(1 + i)3(1 + 2i)\}\{(1 - i)3(1 - 2i)\}$, giving $90 = (-3)^2 + 9^2$.
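The criterion is easy to compare against a brute-force search. The sketch below uses trial-division factorisation, which is adequate only for small $n$; the function names and the tested range are illustrative assumptions.

```python
from math import isqrt

def factorise(n):
    """Prime factorisation by trial division, returned as {prime: exponent}."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def is_sum_of_two_squares(n):
    """The criterion from the text: every prime p = 3 (mod 4) has even exponent."""
    return all(e % 2 == 0 for p, e in factorise(n).items() if p % 4 == 3)

def representations(n):
    """All pairs 0 <= a <= b with a^2 + b^2 == n, found by direct search."""
    reps = []
    for a in range(isqrt(n) + 1):
        b = isqrt(n - a * a)
        if b >= a and a * a + b * b == n:
            reps.append((a, b))
    return reps

print(is_sum_of_two_squares(60), representations(60))
print(is_sum_of_two_squares(90), representations(90))
```

Up to signs and order, the search finds the same decomposition of 90 as the Gaussian-integer factorisation.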

This problem should give the reader a small appreciation for the power of using non- 
integers to study integers. Nonintegers often lurk in the shadows, secretly watching their 
more arrogant brethren the integers strut. One of the consequences of their presence can 
be the existence of certain 'Galois' symmetries. Such happens in RCFT, as we will show 
below. 

Look at complex conjugation: $(wz)^* = w^* z^*$ and $(w + z)^* = w^* + z^*$. Also, $r^* = r$
for any $r \in \mathbb{R}$. So we can say that $*$ is a structure-preserving map $\mathbb{C} \to \mathbb{C}$ (called an
automorphism of $\mathbb{C}$) fixing the reals. We will write this $* \in \mathrm{Gal}(\mathbb{C}/\mathbb{R})$. '$\mathrm{Gal}(\mathbb{C}/\mathbb{R})$' is the
Galois group of $\mathbb{C}$ over $\mathbb{R}$; it turns out to contain only $*$ and the identity.

A way of thinking about the automorphism * is that it says that, as far as the real 
numbers are concerned, i and — i are identical twins. 

Let F be any field containing Q (we defined 'field' in §1.2). The Galois group Gal(F/Q) 
then will be the set of all automorphisms=symmetries of F which fix all rationals. 

For example, take $F$ to be the set of all numbers of the form $a + b\sqrt5$, where $a, b \in \mathbb{Q}$.
Then $F$ will be a field, which is commonly denoted $\mathbb{Q}[\sqrt5]$ because it is generated by $\mathbb{Q}$
and $\sqrt5$. Let's try to find its Galois group. Let $\sigma \in \mathrm{Gal}(F/\mathbb{Q})$. Then $\sigma(a + b\sqrt5) =
\sigma(a) + \sigma(b)\sigma(\sqrt5) = a + b\,\sigma(\sqrt5)$, so once we know what $\sigma$ does to $\sqrt5$, we know everything
about $\sigma$. But $5 = \sigma(5) = \sigma(\sqrt5^{\,2}) = (\sigma(\sqrt5))^2$, so $\sigma(\sqrt5) = \pm\sqrt5$ and there are precisely 2
possible Galois automorphisms here (one is the identity). As far as $\mathbb{Q}$ is concerned, $\pm\sqrt5$
are interchangeable: it cannot see the difference.

For a more important example, consider the cyclotomic field $F = \mathbb{Q}[\xi_n]$, where $\xi_n :=
\exp[2\pi i/n]$ is an $n$th root of 1. So $\mathbb{Q}[\xi_n]$ consists of all complex numbers which can be
expressed as polynomials $a_m \xi_n^m + a_{m-1}\xi_n^{m-1} + \cdots + a_0$ in $\xi_n$ with rational coefficients
$a_i$. Once again, to find the Galois group $\mathrm{Gal}(\mathbb{Q}[\xi_n]/\mathbb{Q})$, it is enough to see what an
automorphism $\sigma$ does to the generator $\xi_n$. Since $\xi_n^n = 1$, we see that it must send it
to another $n$th root of 1, $\xi_n^\ell$ say; in fact it is easy to see that $\sigma(\xi_n)$ must be another
'primitive' $n$th root of 1, i.e. $\ell$ must be coprime to $n$. So $\mathrm{Gal}(\mathbb{Q}[\xi_n]/\mathbb{Q})$ will be isomorphic
to the multiplicative group $\mathbb{Z}_n^\times$ of numbers between 1 and $n$ coprime to $n$. The rationals
can't see any difference between the primitive $n$th roots of 1: for instance $\mathbb{Q}$ can't tell
that $\xi_n^{\pm 1}$ are 'closer to 1' than the other primitive roots. So any $\sigma \in \mathrm{Gal}(\mathbb{Q}[\xi_n]/\mathbb{Q})$ will
correspond to some $\ell \in \mathbb{Z}_n^\times$, and to see what $\sigma$ does to some $z \in \mathbb{Q}[\xi_n]$ what we do is write
$z$ as a polynomial in $\xi_n$ and then replace each occurrence of $\xi_n$ with $\xi_n^\ell$. For example

$$\sigma(\cos(2\pi a/n)) = \sigma\!\left(\frac{\xi_n^a + \xi_n^{-a}}{2}\right) = \frac{\xi_n^{a\ell} + \xi_n^{-a\ell}}{2} = \cos(2\pi a\ell/n) .$$
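As a small illustration of this recipe, the sketch below lists the group $\mathbb{Z}_n^\times$ and applies the automorphism labelled by $\ell$ to $\cos(2\pi a/n)$ by replacing $a$ with $a\ell$, as in the displayed formula. The function names are hypothetical, and the numbers are floating-point approximations rather than exact field elements.

```python
from math import cos, gcd, pi, sqrt

def galois_group(n):
    """The integers l in 1..n coprime to n, labelling Gal(Q[xi_n]/Q)."""
    return [l for l in range(1, n + 1) if gcd(l, n) == 1]

def act_on_cosine(l, a, n):
    """Image of cos(2 pi a / n) under the automorphism xi_n -> xi_n^l."""
    return cos(2 * pi * a * l / n)

n = 5
print(galois_group(n))
for l in galois_group(n):
    print(l, round(act_on_cosine(l, 1, n), 6))
print(round((sqrt(5) - 1) / 4, 6), round((-sqrt(5) - 1) / 4, 6))
```

For $n = 5$ one has $\cos(2\pi/5) = (\sqrt5 - 1)/4$, and $\ell = 2$ sends it to $\cos(4\pi/5) = (-\sqrt5 - 1)/4$: on this subfield the automorphism is exactly the swap $\sqrt5 \mapsto -\sqrt5$ of the previous example.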



$^{17}$ This result was first stated by Fermat in one of his infamous margin notes (another is discussed in Section
2.3), and was finally proved a century later by Euler. A remarkable 1-line proof was found by Zagier [61].

So to summarise, Galois automorphisms are a massive generalisation of the idea of
complex conjugation. If in your problem complex conjugation seems interesting, then there
is a good chance more general Galois automorphisms will play an interesting role. This is
what happens in RCFT, as we now show.

Fact. [14] Suppose S is unitary and symmetric and each $S_{0a} > 0$.

(a) If in addition the numbers $N_{ab}^c$ given by Verlinde's formula (1.4.4) are rational, then
the entries $S_{ab}$ of S must lie in a cyclotomic field.

(b) The numbers $N_{ab}^c$ will be rational iff for any $\sigma \in \mathrm{Gal}(\mathbb{Q}[S]/\mathbb{Q})$,
there is a permutation $a \mapsto a^\sigma$, and a choice of signs $\epsilon_\sigma(a) \in \{\pm 1\}$, such that

$$\sigma(S_{ab}) = \epsilon_\sigma(a)\, S_{a^\sigma, b} = \epsilon_\sigma(b)\, S_{a, b^\sigma} . \qquad (1.8.1)$$

'$\mathbb{Q}[S]$' in part (b) denotes the field generated by $\mathbb{Q}$ and all matrix entries
$S_{ab}$. The argument follows the one given for the charge-conjugation C at the beginning of the last
section. The kinds of complex numbers which lie in cyclotomic fields are $\sin(\pi r)$, $\cos(\pi r)$,
$\sqrt{r}$ and $r\,i$ for any $r \in \mathbb{Q}$. Almost all complex numbers fail to lie in any
cyclotomic field: e.g. generic cube roots, 4th roots, ..., of rationals, as well as transcendental
numbers like $e$, $\pi$ and $e^{\pi}$.

Of course the affine algebras satisfy the conditions of the Fact, as does more generally
the modular matrix S for any unitary RCFT, and so these will possess the Galois action.
For the affine algebras this action has a geometric interpretation in terms of multiplying
weights by an integer $\ell$ and applying Weyl group elements (see [14] for a description).

This Fact is useful in both directions: as a way of testing whether a conjectured 
matrix S has a chance of producing the integral fusions we want it to yield; and more 
importantly as a source of a symmetry of the RCFT which generalises charge-conjugation. 
Any statement about charge-conjugation seems to have an analogue for any of these Galois 
symmetries, although it is usually more complicated. 

As an example, consider $A_1^{(1)}$: (1.4.3) shows explicitly that $S_{\lambda\mu}$ lies in the
cyclotomic field $\mathbb{Q}[\xi_{4(k+2)}]$. Write $\{x\}$ for the number congruent to x mod
$2(k+2)$ satisfying $0 \le \{x\} < 2(k+2)$. Choose any Galois automorphism $\sigma$, and let
$\ell \in \mathbb{Z}_{4(k+2)}^\times$ be the corresponding integer. Then if $\{\ell(a+1)\} < k+2$,
we will have $a^\sigma = \{\ell(a+1)\} - 1$, while if $\{\ell(a+1)\} > k+2$, we'll have
$a^\sigma = 2(k+2) - \{\ell(a+1)\} - 1$. The sign $\epsilon_\sigma(a)$ will depend on a contribution
from $\sqrt{2/(k+2)}$ (which for most purposes can be ignored), as well as the sign +1 or $-1$,
resp., depending on whether or not $\{\ell(a+1)\} < k+2$.

Consider specifically k = 10, and the Galois automorphism $\sigma_5$ corresponding to $\ell = 5$.
Then the permutation is $(10,0) \leftrightarrow (6,4)$, $(9,1) \leftrightarrow (1,9)$,
$(8,2) \leftrightarrow (2,8)$, $(4,6) \leftrightarrow (0,10)$, while $(7,3)$, $(5,5)$ and $(3,7)$
are fixed.
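
The rule just stated is easy to mechanise. The sketch below assumes that a weight
$(\lambda_0, \lambda_1)$ of level k is labelled by $a = \lambda_1$, so that $(10,0)$ is a = 0; the
function name `galois_action` is invented for this illustration, and the sign it returns is only the
$\pm 1$ coming from $\{\ell(a+1)\}$, not the contribution of the square-root prefactor.

```python
from math import gcd


def galois_action(k, ell):
    """Permutation a -> a^sigma and signs for A_1 level k, following the rule in the text."""
    n = k + 2
    if gcd(ell, 2 * n) != 1:
        raise ValueError("ell must be coprime to 2(k+2)")
    perm, sign = {}, {}
    for a in range(k + 1):
        r = (ell * (a + 1)) % (2 * n)  # this is {ell(a+1)}; it is never 0 or n here
        if r < n:
            perm[a], sign[a] = r - 1, +1
        else:
            perm[a], sign[a] = 2 * n - r - 1, -1
    return perm, sign


if __name__ == "__main__":
    k = 10
    perm, sign = galois_action(k, 5)
    for a in range(k + 1):
        print((k - a, a), "->", (k - perm[a], perm[a]), "sign", sign[a])
```

For k = 10 and $\ell = 5$ this reproduces the swaps listed above and shows that a = 3, 5, 7 are the
fixed points.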

This Galois symmetry has been used to find certain exceptional physical invariants,
but its greatest use so far is as a powerful selection rule we will describe in the next section.

1.9. The modern approach to classifying physical invariants

In this final section we include some of the basic tools belonging to the 'modern'
classifications of physical invariants, and we give a flavour of their proofs. We will state
them for the $A_1^{(1)}$ level k problem given above, but everything generalises without effort.
See [29] and references therein for more details. Recall the matrices S, T in (1.4.3).

First note that commutation of M with T implies the selection rule

$$M_{\lambda\mu} \ne 0 \ \Rightarrow\ (\lambda_1 + 1)^2 \equiv (\mu_1 + 1)^2 \pmod{4(k+2)} . \qquad (1.9.1)$$

It is much harder to squeeze information out of the commutation with S, but the resulting
information turns out surprisingly to be much more useful. In fact, commutation with S
is almost incompatible with the constraint $M_{\lambda\mu} \in \mathbb{Z}_{\ge 0}$.

Note that the vacuum $0 \in P_+^k$ is both physically and mathematically special; our
strategy will be to find all possible 0th rows and columns of M, and then for each of these
possibilities to find the remaining entries of M.

The easiest result follows by evaluating MS = SM at $(0, \lambda)$ for any $\lambda \in P_+^k$:

$$\sum_{\mu} M_{0\mu} S_{\mu\lambda} \ge 0 , \qquad (1.9.2)$$

with equality iff the $\lambda$th column of M is identically 0. (The left side equals
$\sum_\mu S_{0\mu} M_{\mu\lambda}$, a sum of nonnegative terms because each $S_{0\mu} > 0$.) (1.9.2)
has two uses: it severely constrains the values of $M_{0\mu}$ (similarly $M_{\mu 0}$), and it says
precisely which columns (and rows) are nonzero.
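
As a concrete check of (1.9.1) and (1.9.2), here is a small sketch for $A_1^{(1)}$ at level k = 4.
It assumes the standard expressions $S_{ab} = \sqrt{2/n}\,\sin(\pi(a+1)(b+1)/n)$ and
$T_{aa} = \exp[2\pi i((a+1)^2/4n - 1/8)]$ with n = k+2, which should agree with (1.4.3) but are not
restated in this section, and it takes as test matrix the familiar invariant
$|\chi_0 + \chi_4|^2 + 2|\chi_2|^2$. All function names are chosen for this illustration only.

```python
import cmath
import math


def s_matrix(k):
    """Assumed A_1 level-k S matrix: sqrt(2/n) sin(pi (a+1)(b+1)/n) with n = k+2."""
    n = k + 2
    c = math.sqrt(2.0 / n)
    return [[c * math.sin(math.pi * (a + 1) * (b + 1) / n) for b in range(k + 1)]
            for a in range(k + 1)]


def t_diagonal(k):
    """Assumed diagonal entries of T: exp(2 pi i ((a+1)^2/(4n) - 1/8))."""
    n = k + 2
    return [cmath.exp(2j * math.pi * ((a + 1) ** 2 / (4 * n) - 1 / 8)) for a in range(k + 1)]


def matmul(A, B):
    return [[sum(A[i][m] * B[m][j] for m in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]


def commutes_with_s(M, k, tol=1e-9):
    S = s_matrix(k)
    MS, SM = matmul(M, S), matmul(S, M)
    return all(abs(MS[i][j] - SM[i][j]) < tol for i in range(k + 1) for j in range(k + 1))


def commutes_with_t(M, k, tol=1e-9):
    # T is diagonal, so MT = TM means M[a][b] = 0 unless T_aa = T_bb.
    t = t_diagonal(k)
    return all(M[a][b] == 0 or abs(t[a] - t[b]) < tol
               for a in range(k + 1) for b in range(k + 1))


def obeys_rule_191(M, k):
    """Selection rule (1.9.1), tested in exact integer arithmetic."""
    n = k + 2
    return all(M[a][b] == 0 or ((a + 1) ** 2 - (b + 1) ** 2) % (4 * n) == 0
               for a in range(k + 1) for b in range(k + 1))


def vacuum_row_sums(M, k):
    """Left side of (1.9.2) for every column label."""
    S = s_matrix(k)
    return [sum(M[0][m] * S[m][l] for m in range(k + 1)) for l in range(k + 1)]


def d4_invariant():
    """Matrix of |chi_0 + chi_4|^2 + 2 |chi_2|^2 at level 4."""
    M = [[0] * 5 for _ in range(5)]
    for a in (0, 4):
        for b in (0, 4):
            M[a][b] = 1
    M[2][2] = 2
    return M


if __name__ == "__main__":
    M = d4_invariant()
    print(commutes_with_s(M, 4), commutes_with_t(M, 4), obeys_rule_191(M, 4))
    print([round(x, 6) for x in vacuum_row_sums(M, 4)])
```

The sums in (1.9.2) should vanish exactly for the odd labels, whose columns of M are zero, and be
strictly positive for the even ones.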

Next, let's apply the triangle inequality to sums involving (1.7.5). Choose any
$i, j \in \{0, 1\}$. Then

$$M_{J^i 0, J^j 0} = \sum_{\lambda,\mu} (-1)^{\lambda_1 i}\, S_{0\lambda}\, M_{\lambda\mu}\, (-1)^{\mu_1 j}\, S_{\mu 0} .$$

Taking absolute values, we obtain

$$M_{J^i 0, J^j 0} \le \sum_{\lambda,\mu} S_{0\lambda}\, M_{\lambda\mu}\, S_{\mu 0} = M_{00} = 1 .$$

Thus $M_{J^i 0, J^j 0}$ can equal only 0 or 1. If it equals 1, then we obtain the selection rule:

$$\lambda_1 i \equiv \mu_1 j \pmod 2 \quad \text{whenever } M_{\lambda\mu} \ne 0 ;$$

this implies the symmetry $M_{J^i\lambda, J^j\mu} = M_{\lambda\mu}$ for all $\lambda, \mu \in P_+^k$.
We can see both of these in the list of physical invariants for $A_1^{(1)}$ level k. This explains
a lot of the properties of those invariants. For instance, try to use this selection rule to explain
why no $\chi_{\rm odd}$ appears in the exceptional called $\mathcal{E}_8$.

Our M is nonnegative, and although multiplying M's may not give us back a physical
invariant, it will give us a matrix commuting with S and T. In other words, the commutant
is much more than merely a vector space, it is in fact an algebra. Thus we should expect
Perron-Frobenius to tell us something here. A first application is the following.

Suppose $M_{\lambda 0} = \delta_{\lambda,0}$, i.e. the 0th column of M is all zeros except for
$M_{00} = 1$. Then Perron-Frobenius implies (with a little work) that M will be a permutation
matrix, i.e. there is some permutation $\pi$ of $P_+^k$ such that
$M_{\lambda\mu} = \delta_{\mu,\pi\lambda}$, and $S_{\pi\lambda,\pi\mu} = S_{\lambda\mu}$. This nice
fact applies directly to the $\mathcal{A}_\star$ and $\mathcal{D}_{\rm odd}$ physical invariants of
$A_1^{(1)}$.

This is proved by studying the powers $(M^T M)^L$ as L goes to infinity: its diagonal
entries will grow exponentially with L, unless there is at most one nonzero entry on each
row of M, and that entry equals 1.

More careful reasoning along those lines tells us about the other generic situation here.
Namely, suppose $M_{\lambda 0} \ne 0$ only for $\lambda = 0$ and $\lambda = J0$, and similarly for
$M_{0\lambda}$, i.e. the 0th row and column of M are all zeros except for $M_{J^i 0, J^j 0} = 1$.
Then the $\lambda$th row (or column) of M will be identically 0 iff $\lambda_1$ is odd. Moreover,
let $\lambda, \mu$ be any non-fixed-points of J, and suppose $M_{\lambda\mu} \ne 0$. Then

$$M_{\lambda\nu} = \begin{cases} 1 & \text{if } \nu = \mu \text{ or } \nu = J\mu \\ 0 & \text{otherwise} \end{cases}$$

with a similar formula for $M_{\nu\mu}$. This applies to the $\mathcal{D}_{\rm even}$ and
$\mathcal{E}_7$ invariants of $A_1^{(1)}$.

Our final ingredient is the Galois symmetry (1.8.1) obeyed by S. Choose any Galois
automorphism $\sigma$. It will correspond to some integer $\ell$ coprime to $2(k+2)$. From (1.8.1)
and $M = SMS^*$ we get, for all $\lambda, \mu$, the important relation

$$M_{\lambda\mu} = \epsilon_\sigma(\lambda)\, \epsilon_\sigma(\mu)\, M_{\lambda^\sigma, \mu^\sigma} . \qquad (1.9.3)$$

From (1.9.3) and the positivity of M, we obtain the powerful Galois selection rule

$$M_{\lambda\mu} \ne 0 \ \Rightarrow\ \epsilon_\sigma(\lambda) = \epsilon_\sigma(\mu) . \qquad (1.9.4)$$

Next let us quickly sketch how these tools are used to obtain the $A_1^{(1)}$ classification.
For details the reader should consult [29].

The first step will be to find all possible values of $\lambda$ such that $M_{0\lambda} \ne 0$ or
$M_{\lambda 0} \ne 0$. These $\lambda$ are severely constrained. We know two generic possibilities:
$\lambda = 0$ (good for all k), and $\lambda = J0$ (good when $k/2$ is even). We now ask the
question, what other possibilities for $\lambda$ are there? Our goal is to prove (1.9.7). Assume
$\lambda \ne 0, J0$, and write $a = \lambda_1 + 1$ and $n = k + 2$.

There are only two constraints on $\lambda$ which we will need. One is (1.9.1):

$$(a-1)(a+1) \equiv 0 \pmod{4n} . \qquad (1.9.5)$$

More useful is the Galois selection rule (1.9.4), which we can write as
$\sin(\pi\ell\frac{a}{n})\,\sin(\pi\ell\frac{1}{n}) > 0$, for all those $\ell$. But a product of
sines can be rewritten as a difference of cosines, so

$$\cos\left(\pi\ell\,\frac{a-1}{n}\right) > \cos\left(\pi\ell\,\frac{a+1}{n}\right) . \qquad (1.9.6)$$

(1.9.6) is strong and easy to solve; the reader should try to find her own argument.
What we get is that, provided $n \ne 12, 30$, M obeys the strong condition

$$M_{\lambda 0} \ne 0 \ \text{or}\ M_{0\lambda} \ne 0 \ \Rightarrow\ \lambda \in \{0, J0\} . \qquad (1.9.7)$$
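
The two constraints can also be scanned by brute force. The following sketch (the helper names are
invented here) tests (1.9.5) and the sign form of the Galois rule exactly, using the fact that the
sign of $\sin(\pi m/n)$ for integer m depends only on m mod 2n. If (1.9.7) is right, the scan should
report extra solutions only at n = 12 and n = 30.

```python
from math import gcd


def sine_sign(m, n):
    """Sign of sin(pi*m/n) for integers m and n > 0, computed without floating point."""
    r = m % (2 * n)
    if r == 0 or r == n:
        return 0
    return 1 if r < n else -1


def extra_vacuum_couplings(n):
    """All a with 1 < a < n-1 obeying (1.9.5) and the Galois rule for every ell coprime to 2n."""
    hits = []
    for a in range(2, n - 1):
        if (a * a - 1) % (4 * n) != 0:
            continue
        if all(sine_sign(ell * a, n) == sine_sign(ell, n)
               for ell in range(1, 2 * n) if gcd(ell, 2 * n) == 1):
            hits.append(a)
    return hits


if __name__ == "__main__":
    for n in range(3, 101):
        found = extra_vacuum_couplings(n)
        if found:
            print(n, found)
```

The values a = 7 at n = 12 and a = 11, 19 at n = 30 correspond to the weights coupling to the vacuum
in the exceptional invariants at levels k = 10 and k = 28.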

Consider first case 1: $M_{\lambda 0} = \delta_{\lambda,0}$. From above, we know
$M_{\lambda\mu} = \delta_{\mu,\pi\lambda}$ for some permutation $\pi$ of $P_+^k$ obeying
$S_{\lambda\mu} = S_{\pi\lambda,\pi\mu}$. We know $\pi 0 = 0$; put $\mu := \pi(k-1, 1)$. Then
$\sin(\pi\frac{2}{n}) = \sin(\pi\frac{\mu_1+1}{n})$, and so we get either $\mu = (k-1, 1)$ or
$\mu = J(k-1, 1)$. By T-invariance (1.9.1), the second possibility can only occur if
$4 \equiv (n-2)^2 \pmod{4n}$, i.e. 4 divides n. But for those n, $\mathcal{D}_{n/2+1}$ is also a
permutation matrix, so replacing M if necessary with the matrix product $M\,\mathcal{D}_{n/2+1}$,
we can always require $\mu = (k-1, 1)$, i.e. $\pi$ also fixes $(k-1, 1)$. It is now easy to show
$\pi$ must fix any $\lambda$, i.e. that M is the identity matrix $\mathcal{A}_{n-1}$.

The other possibility, case 2, is that both $M_{0,J0} \ne 0$ and $M_{J0,0} \ne 0$. (1.9.1) says
$1 \equiv (n-1)^2 \pmod{4n}$, i.e. $n/2$ is odd. The argument here is similar to that of case 1, but
with $(k-2, 2)$ playing the role of $(k-1, 1)$. We can show that $M_{(k-2,2),(k-2,2)} \ne 0$, except
possibly for k = 16, where we find the exceptional $\mathcal{E}_7$. Otherwise we get
$M = \mathcal{D}_{n/2+1}$.

For more general $X_r^{(1)}$ level k, the approach is

(i) to look at all the constraints on the $\lambda \in P_+^k$ for which $M_{0\lambda} \ne 0$ or
$M_{\lambda 0} \ne 0$. Most important here are TM = MT (which will always be some sort of norm
selection rule) and the Galois selection rule (1.9.4). Generically, what we will find is that such
a $\lambda$ must equal $J0$ for some simple-current J, as in (1.9.7) for $A_1^{(1)}$.

(ii) Solve this generic case (in the $A_1^{(1)}$ classification, these were the physical invariants
$\mathcal{A}_\star$, $\mathcal{D}_\star$ and $\mathcal{E}_7$).

(iii) Solve the nongeneric case. The worst of these are the orthogonal algebras at k = 2,
as well as the places where conformal embeddings (see §1.7) occur.

(ii) has recently been completed for all simple $X_r$, as has the k = 2 part of (iii). (i) is
the main remaining task in the physical invariant classification for simple $X_r$.

A natural question to ask is whether A-D-E has been observed in e.g. the $A_2^{(1)}$
classification. The answer is no, although the fusion graph theory of Di Francesco-Petkova-Zuber
[62] is an attempt to assign to these physical invariants graphs reminiscent of the A-D-E
Dynkin diagrams. Also, there is related work trying to understand the $A_2^{(1)}$ classification
in terms of subgroups of $\mathrm{SU}_3(\mathbb{C})$ (as opposed to $\mathrm{SU}_2(\mathbb{C})$ for
$A_1^{(1)}$); see e.g. [35]. Finding the $A_3^{(1)}, A_4^{(1)}, \ldots$ classifications would permit
the clarification and testing of this vaguely conjectured relation between the $A_n^{(1)}$ physical
invariants, and singularities $\mathbb{C}^{n+1}/G$ for G a finite subgroup of
$\mathrm{SU}_{n+1}(\mathbb{C})$.

However, a few years ago Philippe Ruelle was walking in a library in Dublin. He
spotted a yellow book in the math section, called Complex Multiplication by Lang. A
strange title for a book by Lang! After all, there can't be all that much even Lang could
really say about complex multiplication! Ruelle flipped it to a random page, which turned
out to be p. 26. On there he found what we would call the Galois selection rule for $A_2^{(1)}$,
analysed and solved for the cases where k + 3 is coprime to 6. Lang however didn't
know about physical invariants; he was reporting on work by Koblitz and Rohrlich on
decomposing the Jacobians of the Fermat curve $x^n + y^n = z^n$ into their prime pieces,
called 'simple factors' in algebraic geometry. n here corresponds to k + 3. Similarly,
Itzykson discovered traces of the $A_2^{(1)}$ exceptionals (these occur when k + 3 = 8, 12, 24)
in the Jacobian of $x^{24} + y^{24} = z^{24}$. See [2] for further observations along these lines.
These 'coincidences' are still far from understood. Nor is it known if, more generally, the
$A_\ell^{(1)}$ level k classification will somehow be related to the hypersurface
$x_1^n + \cdots + x_\ell^n = z^n$, for $n = k + \ell + 1$.

The $(u(1) \oplus \cdots \oplus u(1))^{(1)}$ classification has connections to rational points on Grassman-
nians. The Grassmannian is (essentially) the moduli space for the Narain compactifications 
of the (classical) lattice string. It would be very interesting to interpret other large families 
of physical invariants as special points on other moduli spaces. 

These new connections relating various physical invariant classifications to other areas 
of math seem to indicate that although the physical invariant classifications are difficult, 
they could be well worth the effort and be of interest outside RCFT. Once the physi- 
cal invariant lists are obtained, we will still have the fascinating task of explaining and 
developing all these mysterious connections. These thoughts keep me going! 

Another motivation for completing these lists comes from their relation to subfactor
theory in von Neumann algebras$^{18}$. These algebras (see e.g. [22]) can be thought of as
symmetries of a (generally infinite) group. Their building blocks are called factors. Jones
initiated the combinatorial study of subfactors N of M (i.e. inclusions $N \subseteq M$ where
M, N are factors), relating it to e.g. knots, and for this won a Fields medal in 1990. Jones
assigned to each subfactor $N \subset M$ a numerical invariant called an 'index', a sort of
(generally irrational) ratio of dimensions. Graphs (called principal and dual principal) are also
associated to subfactors. A much more refined subfactor invariant, called a 'paragroup',
has been introduced by Ocneanu. It is essentially equivalent to a (2+1)-dimensional
topological field theory. Moreover, any RCFT can be assigned a paragroup, and any paragroup
(via a process called asymptotic inclusion which is akin to Drinfeld's quantum doubling of
Hopf algebras) yields an RCFT. See [22] for details.

Böckenhauer-Evans [4] have recently developed this much further, and have clarified
the fusion graph $\leftrightarrow$ physical invariant relation. The fusion graphs will correspond to
subfactor principal graphs. In the work of Di Francesco-Petkova-Zuber, that relation
seems to be only empirical (i.e. nonconceptual).

Subfactor theory together with singularity theory is our best hope at present for
understanding and generalising the A-D-E meta-pattern.

18
For reasons of necessity, in the following discussion I'll take more liberties than usual in the presentation.

Part 2. Monstrous Moonshine

2.1. Introduction 

In 1978, John McKay made a very curious observation. One of the well-known$^{19}$
functions of classical number theory is the j-function$^{20}$, given by

$$j(\tau) := \frac{\theta_{E_8}(\tau)^3}{\eta(\tau)^{24}} = \frac{\left(1 + 240\sum_{n=1}^\infty \sigma_3(n)\, q^n\right)^3}{q\prod_{n=1}^\infty (1-q^n)^{24}}$$
$$= q^{-1} + 744 + 196\,884\, q + 21\,493\,760\, q^2 + 864\,299\,970\, q^3 + \cdots \qquad (2.1.1)$$

Here as elsewhere in this paper, $q = \exp[2\pi i\tau]$. Also, $\sigma_3(n) = \sum_{d|n} d^3$,
$\theta_{E_8}$ is the theta function of the $E_8$ root lattice, and $\eta$ is the Dedekind eta.
What is important here are the values of the first few coefficients. What McKay noticed was that
$196\,884 \approx 196\,883$. Closer inspection shows $21\,493\,760 \approx 21\,296\,876$, and
$864\,299\,970 \approx 842\,609\,326$. In fact,

196 884 = 196 883 + 1 (2.1.2a)

21 493 760 = 21 296 876 + 196 883 + 1 (2.1.2b)

864 299 970 = 842 609 326 + 21 296 876 + 2 · 196 883 + 2 · 1 (2.1.2c)

The numbers on the right-side are the dimensions of the smallest irreducible representations 
of the Monster finite simple group M (in 1978 it still wasn't certain that M even existed so 
back then these numbers were merely conjectural). The same game could be played with 
other coefficients of the j-function. With numbers so large, it seemed to him doubtful that 
this numerology was merely a coincidence. On the other hand, it was hard to imagine any 
deep conceptual connection between the Monster and the j-function: they seem completely 
unrelated. 
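
The coefficients in (2.1.1) and the decompositions (2.1.2) can be reproduced with exact integer
arithmetic. The sketch below uses the standard expression of j as the cube of
$1 + 240\sum\sigma_3(n)q^n$ divided by $q\prod(1-q^n)^{24}$; the function names are invented for
this illustration.

```python
def sigma3(n):
    """Sum of the cubes of the divisors of n."""
    return sum(d ** 3 for d in range(1, n + 1) if n % d == 0)


def series_mul(a, b, size):
    """Product of two power series, truncated to the first `size` coefficients."""
    out = [0] * size
    for i, ai in enumerate(a[:size]):
        if ai == 0:
            continue
        for j, bj in enumerate(b[:size - i]):
            out[i + j] += ai * bj
    return out


def j_coefficients(size):
    """First `size` coefficients of q*j(tau); entry i multiplies q**(i-1) in j."""
    theta = [1] + [240 * sigma3(n) for n in range(1, size)]
    numerator = series_mul(series_mul(theta, theta, size), theta, size)
    # Build 1 / prod_n (1 - q^n)^24 by dividing by (1 - q^n) twenty-four times for each n.
    inverse_eta24 = [1] + [0] * (size - 1)
    for n in range(1, size):
        for _ in range(24):
            for i in range(n, size):
                inverse_eta24[i] += inverse_eta24[i - n]
    return series_mul(numerator, inverse_eta24, size)


if __name__ == "__main__":
    c = j_coefficients(5)
    print(c)
    dims = [1, 196883, 21296876, 842609326]  # smallest Monster irreducible dimensions
    print(c[2] == dims[1] + dims[0])
    print(c[3] == dims[2] + dims[1] + dims[0])
    print(c[4] == dims[3] + dims[2] + 2 * dims[1] + 2 * dims[0])
```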

In November 1978 he mailed the 'McKay equation' (2.1.2a) to John Thompson. At 
first Thompson dismissed this as nonsense, but after checking the next few coefficients he 
became convinced. He then added a vital piece to the puzzle. It should be well-known that 
when one sees a nonnegative integer, it often helps to try to interpret it as the dimension 
of some vector space. Essentially, that is what McKay was proposing here. (2.1.2) are 
really hinting that there is a 'graded' representation V of M: 

$$V = V_{-1} \oplus V_1 \oplus V_2 \oplus V_3 \oplus \cdots$$

where $V_{-1} = \rho_0$, $V_1 = \rho_1 \oplus \rho_0$, $V_2 = \rho_2 \oplus \rho_1 \oplus \rho_0$,
$V_3 = \rho_3 \oplus \rho_2 \oplus \rho_1 \oplus \rho_1 \oplus \rho_0 \oplus \rho_0$, etc,
where $\rho_i$ are the irreducible representations of $\mathbb{M}$ (ordered by dimension), and that

$$j(\tau) - 744 = \dim_q(V) := \dim(V_{-1})\, q^{-1} + \sum_{i=1}^\infty \dim(V_i)\, q^i , \qquad (2.1.3)$$

the graded dimension of V.

19
'Well-known' is math euphemism for 'a basic result of which until recently we were utterly ignorant.' As
Conway later said, "the j-function was 'well-known' to other people, but not 'well-known' to me."

20
This and other technical terms used in this introduction will be carefully explained in the following
subsections. This section is merely offered as a quick overview.

Thompson suggested that we twist $\dim_q(V)$, i.e. that more generally we consider the
series (now called the McKay-Thompson series)

$$T_g(\tau) := \mathrm{ch}_V(g) = \mathrm{ch}_{V_{-1}}(g)\, q^{-1} + \sum_{i=1}^\infty \mathrm{ch}_{V_i}(g)\, q^i , \qquad (2.1.4)$$

for each element $g \in \mathbb{M}$. The point is that, for any group representation $\rho$, the
character value $\mathrm{ch}_\rho(\mathrm{id.})$ equals the dimension of $\rho$, and so
$T_{\mathrm{id.}}(\tau) = j(\tau) - 744$ and we recover (2.1.2) as special cases. But there are
many other possible choices of $g \in \mathbb{M}$. Thompson couldn't guess what these functions
$T_g$ would be, but he suggested that they too might be interesting. This is a nice thought: when we
see a positive integer, we should try to interpret it as a dimension of a vector space; if there is
a symmetry present, then it may act on the vector space (i.e. our vector space may carry a
representation of that symmetry group), in which case we can apply the Thompson trick and see what
if any significance the other character values have in our context.

Conway and Norton [12] did precisely what Thompson asked. Conway called it "one
of the most exciting moments in my life" [11] when he opened Jacobi's foundational (but
150 year old!) book on elliptic and modular functions and found that the first few terms
of the McKay-Thompson series agreed perfectly with the first few terms of certain special
functions, namely the Hauptmoduls of various genus 0 modular groups. Monstrous
Moonshine was officially born.

The word 'moonshine' here is English slang for 'unsubstantial or unreal'. It was 
chosen by Conway to convey as well the feeling that things here are dimly lit, and that 
Conway-Norton were 'distilling information illegally' from the Monster character table. 

In fact the first incarnation of Moonshine goes back to Andrew Ogg in 1975. He was in
France describing his result that the primes p for which the group $\Gamma_0(p)+$ has genus 0, are
{2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 41, 47, 59, 71}. $\Gamma_0(p)+$ is the group generated by
$\begin{pmatrix} 0 & 1 \\ -p & 0 \end{pmatrix}$ and $\Gamma_0(p)$, and is the normaliser of
$\Gamma_0(p)$ in $\mathrm{SL}_2(\mathbb{R})$ (this sentence will make a little more
sense after §2.3, but it isn't important here to understand it). He also attended a lecture
by Jacques Tits, who was describing a newly conjectured simple group. When Tits wrote
down the prime decomposition of the order of that group (see (2.2.1) below), Ogg noticed
its prime factors precisely equalled his list of primes. Presumably as a joke, he offered a
bottle of Jack Daniels' whisky to the first person to explain the coincidence.

The next step was accomplished by Griess in 1980, with the construction of the
Monster$^{21}$ $\mathbb{M}$, and with it the proof that the conjectured character table for
$\mathbb{M}$ was correct. Griess did this by explicitly constructing the 196883-dimensional
representation $\rho_1$; it turns out to have a (commutative nonassociative) algebra structure, now
called the Griess algebra. Though this paper was clearly important, the construction was artificial
and 100 pages long: since the Monster is presumably a natural mathematical object (see §2.2),
an elegant construction for it should exist. This was ultimately accomplished in the mid
1980s with the construction by Frenkel-Lepowsky-Meurman [23] of the Moonshine module
$V^\natural$ and its interpretation by Borcherds as a vertex operator algebra. The Griess algebra
appears naturally in $V^\natural$, as we shall see. $V^\natural$ does indeed seem to be a 'natural'
mathematical structure, and $\mathbb{M}$ is its automorphism group: in fact $V^\natural$ is the
graded representation V of $\mathbb{M}$ conjectured by McKay and Thompson.

21
Griess also came up with the symbol for the Monster; Conway came up with the name.

Connections with physics (CFT) go back to Dixon-Ginsparg-Harvey [19] in 1988, in
a paper titled "Beauty and the beast: Superconformal symmetry in a Monster module".
The Moonshine module $V^\natural$ can be interpreted as the string theory for a
$\mathbb{Z}_2$-orbifold of free bosons compactified on the torus $\mathbb{R}^{24}/\Lambda_{24}$
($\Lambda_{24}$ is the Leech lattice). Many aspects of Moonshine make complete sense within CFT, but
some (e.g. the genus zero property) remain more obscure. (Though in 1987 Moore speculated that the
0-genus of $\Gamma_0(p)+$ could be related to the vanishing of the cosmological constant in certain
string theories related to $\mathbb{M}$, and Tuite [57] related genus-zero with the conjectured
uniqueness of $V^\natural$.) Nevertheless this helps make the words of Dyson ring prophetic: "I have
a sneaking hope, a hope unsupported by any facts or any evidence, that sometime in the twenty-first
century physicists will stumble upon the Monster group, built in some unsuspected way into the
structure of the universe" [21].

Finally, in 1992 Borcherds [5] completed the proof of the Conway-Norton conjectures
by showing $V^\natural$ is the desired representation V. The full conceptual relationship between
the Monster and the Hauptmoduls (like j) seems to remain 'dimly lit', although much
progress has been realised. This is a subject where it is much easier to conjecture than to
prove, and we are still awash in unresolved conjectures.

McKay also noticed in 1978 that similar coincidences hold if $\mathbb{M}$ and $j(\tau)$ are
replaced with the Lie group $E_8(\mathbb{C})$ and $(q\,j(q))^{1/3} = 1 + 248q + \cdots$. This turns
out to be much easier to explain, and in 1980 both Kac and Lepowsky remarked that the unique level 1
highest-weight representation of the affine algebra $E_8^{(1)}$ has graded dimension
$(q\,j(q))^{1/3}$.

Moonshiners have a little chip on their shoulders. Modern math, they say, tends to be 
a little too infatuated with the pursuit of generalisations for generalisations' sake. Surely 
a noble goal for math is to find interesting and fundamentally new theorems. It can be 
argued that both history and common-sense suggest that to this end it is most profitable to 
look simultaneously at both exceptional structures and generic structures, to understand 
the special features of the former in the context of the latter, and to be led in this way to a 
new generation of exceptional and generic structures. Moonshiners would sympathise with 
those biologists who study the duck-billed platypus and lungfish rather than hide them in 
the closet as monsters: BECAUSE they appear to be unique, those animals presumably 
have much to teach us about our general understanding of evolution, etc. 

It often seems to people that Moonshine can't be very deep: the Conway-Norton
conjectures seem to be so finite$^{22}$ and specialised. There only are 171 distinct
McKay-Thompson series $T_g$ in Monstrous Moonshine, after all. The whole point though is to try
to understand why the Monster and the Hauptmoduls are so related, and then to try to
extend and apply this understanding to other contexts. Moonshine is still young, and our
understanding remains incomplete. But already math has benefitted: e.g. we now have a
natural definition of $\mathbb{M}$ (as the automorphism group of $V^\natural$), and Moonshine
helped lead us to the rich structures of generalised Kac-Moody algebras and vertex operator algebras.

We will see that Moonshine involves the interplay between exceptional structures such
as the number 24, the Leech lattice $\Lambda_{24}$, the Monster group $\mathbb{M}$, and the
Moonshine module $V^\natural$, and generic structures such as modular functions, vertex operator
algebras, generalised Kac-Moody algebras, and conformal field theories. The following sections will
introduce the reader to many of these structures, as we use Moonshine as another happy excuse to
take a second little tour through modern mathematics.

22
Indeed the Moonshine conjectures are finite (it is enough to check the first 1200 coefficients), and a
slightly weaker form was quickly proved on a computer by Atkin, Fong and Smith [54]. However this sort of
argument adds no light to Moonshine, and tells us nothing of $V^\natural$ except that it exists.



2.2. Ingredient #1: Finite simple groups and the Monster 

A readable introduction to the basics of finite group representation theory is [25]. 
The finite simple groups are described in [33]; see also [11]. Group representations were 
introduced in §1.3. 

A normal subgroup H of a group G is one obeying $gHg^{-1} = H$ for all $g \in G$. These
are important because the set G/H of 'cosets' gH has a natural group structure precisely
when H is normal. Every group has two trivial normal subgroups: itself and {1}. If
these are the only normal subgroups, the group is called simple. It is conventional to
regard the trivial group {1} as not simple (just as 1 is conventionally regarded as not
prime). An alternate definition of a (finite) simple group G is that if $\varphi : G \to H$ is any
group homomorphism (i.e. structure-preserving map: $\varphi(gg') = \varphi(g)\varphi(g')$), then
$\varphi$ is either constant (i.e. $\varphi(G) = \{1\}$), or $\varphi$ is one-to-one.

The importance of simple groups is provided by the Jordan-Hölder Theorem. By a
'composition series' for a group G, we mean a nested sequence

$$G = H_0 \supset H_1 \supset H_2 \supset \cdots \supset H_k \supset H_{k+1} = \{1\}$$

of groups such that $H_i$ is normal in $H_{i-1}$, and $H_{i-1}/H_i$ (called a 'composition factor')
is simple. Any finite group G has at least one composition series. If
$H'_0 \supset \cdots \supset H'_{\ell+1} = \{1\}$ is a second composition series for G, then
Jordan-Hölder says that $k = \ell$ and, up to a reordering $\pi$, the simple groups $H_{i-1}/H_i$
and $H'_{\pi i-1}/H'_{\pi i}$ are isomorphic.

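For example, the cyclic group $\mathbb{Z}_n$ of order (=size) n (you can think of it as the integers
modulo n under addition) is simple iff n is prime. Consider the group
$\mathbb{Z}_{12} = \langle 1 \rangle$. Two composition series are

$$\mathbb{Z}_{12} \supset \langle 2 \rangle \supset \langle 4 \rangle \supset \langle 0 \rangle$$
$$\mathbb{Z}_{12} \supset \langle 3 \rangle \supset \langle 6 \rangle \supset \langle 0 \rangle$$

corresponding to composition factors $\mathbb{Z}_2, \mathbb{Z}_2, \mathbb{Z}_3$, and
$\mathbb{Z}_3, \mathbb{Z}_2, \mathbb{Z}_2$. Of course this is consistent with Jordan-Hölder. This is
reminiscent of the fact that $2 \cdot 2 \cdot 3 = 3 \cdot 2 \cdot 2$ are both
prime factorisations of 12.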
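
For cyclic groups the composition factors can be read off from subgroup orders alone. The sketch
below (function names invented for this illustration) describes a chain in $\mathbb{Z}_n$ by the
generators of its subgroups and returns the orders of the successive quotients.

```python
from math import gcd


def subgroup_order(n, g):
    """Order of the cyclic subgroup of Z_n generated by g."""
    return n // gcd(n, g)


def composition_factor_orders(n, generators):
    """Orders of the quotients H_{i-1}/H_i along a chain of cyclic subgroups of Z_n."""
    orders = [subgroup_order(n, g) for g in generators]
    for big, small in zip(orders, orders[1:]):
        # In a cyclic group, one subgroup contains another iff the orders divide.
        if big % small != 0:
            raise ValueError("the subgroups are not nested")
    return [big // small for big, small in zip(orders, orders[1:])]


if __name__ == "__main__":
    print(composition_factor_orders(12, [1, 2, 4, 0]))
    print(composition_factor_orders(12, [1, 3, 6, 0]))
```

Both chains give the same multiset of prime orders, as Jordan-Hölder requires, only in a different
sequence.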

There is some value to regarding finite groups as a massive generalisation of the notion
of number. The number n can be identified with the cyclic group $\mathbb{Z}_n$. The divisor of a
number corresponds to a normal subgroup, so a prime number corresponds to a simple
group. The Jordan-Hölder Theorem generalises the uniqueness of prime factorisations.
That you can build up any number by multiplying primes, is generalised to building up
a group by semi-direct products (more generally, by group extensions): if H is a normal
subgroup of G, then G will be an extension of H by the quotient group G/H.

Note however that $\mathbb{Z}_6 \times \mathbb{Z}_2$ and $\mathfrak{S}_3 \times \mathbb{Z}_2$, both
different from $\mathbb{Z}_{12}$, will also have $\mathbb{Z}_2, \mathbb{Z}_2, \mathbb{Z}_3$ as
composition factors: unlike for numbers, 'multiplication' here does not give a unique answer. The
semidirect product $\mathbb{Z}_2 \rtimes \mathbb{Z}_2$ can equal either $\mathbb{Z}_4$ or
$\mathbb{Z}_2 \times \mathbb{Z}_2$, depending on how the product is taken. More precisely, the
notation $G \rtimes G'$ means a group where every element can be written uniquely as a pair
$(g, g')$, for $g \in G$ and $g' \in G'$, and where the group operation is
$(g, g')(h, h') = (\text{stuff}, g'h')$.

Thus simple groups have an importance for group theory approximating what primes 
have for number theory. One of the greatest accomplishments of twentieth century math is 
surely the classification of the finite simple groups. (On the other hand, group extensions
turn out to be technically quite difficult and lead one into group cohomology.) This work,
completed in the early 1980s (although gaps are continually being discovered and filled in 
the arguments), runs to approximately 15 000 journal pages, spread over 500 individual 
papers, and is the work of a whole generation of group theorists. A modern revision is 
currently underway (see e.g. [34]) to simplify the proof and find and fill all gaps, but the 
final proof is still expected to be around 4000 pages long. The resulting list is: 

• the cyclic groups $\mathbb{Z}_p$ (p a prime);

• the alternating groups $\mathfrak{A}_n$ for $n \ge 5$;

• 16 families of Lie type;

• 26 sporadics.

We've already met the cyclic groups. The alternating group $\mathfrak{A}_n$ consists of the even
permutations in the symmetric group $\mathfrak{S}_n$, and so has order (=size) $\frac{1}{2}n!$. The
groups of Lie type are essentially Lie groups defined over finite fields$^{23}$ $\mathbb{F}_q$ (such
as $\mathbb{Z}_p$), sometimes 'twisted' in certain senses. The simplest example is
$\mathrm{PSL}_n(\mathbb{F}_q)$, which consists of the n × n matrices with entries in $\mathbb{F}_q$,
with determinant 1, quotiented out by the centre of $\mathrm{SL}_n(\mathbb{F}_q)$ (namely the scalar
matrices diag(a, a, ..., a) for $a^n = 1$) (except for $\mathrm{PSL}_2(\mathbb{Z}_2)$
and $\mathrm{PSL}_2(\mathbb{Z}_3)$, which aren't simple).

Note that the determinant $|\rho(g)|$ for any representation $\rho$ of any (noncyclic) simple
group must be 1, otherwise we would violate the homomorphism definition of simple group
(try to see why). Also, the centre of any (noncyclic) simple group must be trivial (why?).
The smallest noncyclic simple group is $\mathfrak{A}_5$, with order 60.$^{24}$ It is the same as
(isomorphic to) $\mathrm{PSL}_2(\mathbb{Z}_5)$ and $\mathrm{PSL}_2(\mathbb{F}_4)$, and can also be
expressed as the group of all rotations (reflections have determinant $-1$ and so cannot belong to
any simple group) of $\mathbb{R}^3$ that bring a regular icosahedron back to itself.

23
There is a finite field with q elements, iff q is a power of a prime. For each such q, there is only 1
field of that size. The field with prime p elements is the integers taken mod p.

24
This implies, incidentally, that if G and H are any two groups with the same order below 60, then they
will have the same composition factors.

The smallest sporadic group is the Mathieu group $M_{11}$, order 7920, discovered in
1861$^{25}$. The largest is the Monster $\mathbb{M}$, conjectured by Fischer and Griess in 1973 and
finally proved to exist by Griess in 1980. Its order is$^{26}$

$$\|\mathbb{M}\| = 2^{46} \cdot 3^{20} \cdot 5^{9} \cdot 7^{6} \cdot 11^{2} \cdot 13^{3} \cdot 17 \cdot 19 \cdot 23 \cdot 29 \cdot 31 \cdot 41 \cdot 47 \cdot 59 \cdot 71 \approx 8 \times 10^{53} . \qquad (2.2.1)$$

20 of the 26 sporadics are involved in (i.e. are quotients of subgroups of) the Monster.
Some relations among $\mathbb{M}$, the Leech lattice $\Lambda_{24}$ and the largest Mathieu group
$M_{24}$ are given in Chapters 10 and 29 of [13].
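
The order (2.2.1) is small enough to handle exactly with Python integers. This sketch (the names
`MONSTER_FACTORS` and `monster_order` are chosen for the illustration) multiplies out the prime
factorisation and checks that the three representation dimensions from (2.1.2) divide the result,
and that the primes involved are the fifteen on Ogg's list.

```python
MONSTER_FACTORS = {2: 46, 3: 20, 5: 9, 7: 6, 11: 2, 13: 3, 17: 1, 19: 1,
                   23: 1, 29: 1, 31: 1, 41: 1, 47: 1, 59: 1, 71: 1}


def monster_order():
    """Order of the Monster, multiplied out from the prime factorisation (2.2.1)."""
    order = 1
    for prime, exponent in MONSTER_FACTORS.items():
        order *= prime ** exponent
    return order


if __name__ == "__main__":
    order = monster_order()
    print(order)
    print(f"{float(order):.2e}")
    for dim in (196883, 21296876, 842609326):
        print(dim, order % dim == 0)
    print(sorted(MONSTER_FACTORS))
```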

Moonshine hints at a tantalising connection between the classification of finite simple 
groups, and the classification of RCFTs discussed in Part 1. Speculates [23] (page xli): 
"One can certainly hope for a uniform description of the finite simple groups as automor- 
phism groups of certain vertex operator algebras — or conformal quantum field theories. 
If such a quantum field theory could somehow be attached a priori to a finite simple group, 
the classification of such theories, a problem of great current interest among string theo- 
rists, might some day be part of a new approach to the classification of the finite simple 
groups. On the other hand, can the known classification of the finite simple groups help 
in the classification of conformal field theories?" 



2.3. Ingredient #2: Modular functions and Hauptmoduls 

A readable introduction to some of the topics discussed in this section is [16,42,60]. 

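We know from complex analysis that the group $\mathrm{SL}_2(\mathbb{R})$ of 2 × 2 matrices with real
entries and determinant 1, acts on the upper-half plane
$\mathcal{H} = \{\tau \in \mathbb{C} \mid \mathrm{Im}(\tau) > 0\}$ by
fractional linear (or Möbius) transformations:

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} . \tau = \frac{a\tau + b}{c\tau + d} . \qquad (2.3.1)$$

For example the matrix $S := \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$ corresponds to the
function $\tau \mapsto -1/\tau$, while the matrix
$T := \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ corresponds to the translation
$\tau \mapsto \tau + 1$. Since $\pm\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ correspond to the
same Möbius transformation, strictly speaking our group here is
$\mathrm{PSL}_2(\mathbb{R}) = \mathrm{SL}_2(\mathbb{R})/\{\pm I\}$.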
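
The action (2.3.1) is easy to experiment with. In the sketch below (helper names invented for this
illustration) matrices are pairs of rows, and we check that matrix multiplication corresponds to
composition of maps and that $(ST)^3 = -I$, which therefore acts trivially.

```python
def mobius(m, tau):
    """Apply the 2x2 matrix m = ((a, b), (c, d)) to tau as in (2.3.1)."""
    (a, b), (c, d) = m
    return (a * tau + b) / (c * tau + d)


def matmul2(m1, m2):
    """Product of two 2x2 matrices given as pairs of rows."""
    (a, b), (c, d) = m1
    (e, f), (g, h) = m2
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))


S = ((0, -1), (1, 0))
T = ((1, 1), (0, 1))

if __name__ == "__main__":
    tau = 0.3 + 2j
    print(mobius(S, tau), -1 / tau)
    print(mobius(T, tau), tau + 1)
    ST = matmul2(S, T)
    print(matmul2(matmul2(ST, ST), ST))         # should be minus the identity
    print(mobius(ST, tau), mobius(S, mobius(T, tau)))
```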

The only reason this action (2.3.1) of the 2 × 2 matrices on complex numbers (or more
precisely the Riemann sphere $\mathbb{C} \cup \{\infty\}$) might not look strange to us, is because
familiarity breeds numbness. What we really have is a natural action of n × n matrices on
$\mathbb{C}^n$, and this induces their action on $\mathbb{C}^{n-1}$ (together with a codimension-2
set of 'points at infinity') by interpreting $\mathbb{C}^n$ as projective coordinates for
$\mathbb{C}^{n-1}$. Specialising to n = 2 gives us (2.3.1). In projective geometry, 'parallel lines'
intersect at $\infty$. Projective coordinates allow one to treat 'finite' and 'infinite' points on
an equal footing.

25
Although his arguments apparently weren't very convincing. In fact some people, including the Jordan of
Jordan-Hölder fame, argued in later papers that the largest of Mathieu's sporadic groups couldn't exist.

26
The inquisitive reader, hungry for more 'coincidences', may have noticed that 196883 and 21296876 (see
(2.1.2)) exactly divide the order of the Monster. Indeed this will hold for any finite group: the
dimensions of the irreducible representations of a finite group will always divide its order.

Consider $\Gamma := \mathrm{SL}_2(\mathbb{Z})$, the subgroup of $\mathrm{SL}_2(\mathbb{R})$ consisting of the matrices with integer
entries. It can be shown that it is generated by $S$ and $T$ (in other words, every matrix
$\alpha\in\Gamma$ can be expressed as a monomial in $S$ and $T$). For reasons that will be clear shortly,
consider the extended upper-half plane $\overline{\mathcal{H}} := \mathcal{H}\cup\{i\infty\}\cup\mathbb{Q}$; the extra points $\{i\infty\}\cup\mathbb{Q}$
are called cusps. $\Gamma$ acts on $\overline{\mathcal{H}}$ (e.g. $S$ interchanges 0 and $i\infty$). By a modular function
for $\Gamma$, we mean a meromorphic function $f : \mathcal{H}\to\mathbb{C}$, symmetric with respect to $\Gamma$: i.e.
$f(\alpha(\tau)) = f(\tau)$ for all $\alpha\in\Gamma$. Note that we require $f$ to be meromorphic at the cusps (e.g.
polynomials are meromorphic at $i\infty$, but the exponential function is not).

It is not obvious why modular functions should be interesting, but in fact they are one
of the most fundamental notions in modern number theory (see the last paragraph of §1.6).
For example, consider the question of writing numbers as sums of squares. We can write
$5 = 1^2 + (-2)^2 = (-1)^2 + 1^2 + 0^2 + 1^2 + (-1)^2 + 1^2$, to give a couple of trivial examples. Let
$N_n(k)$ be the number of ways we can write the integer $n$ as a sum of $k$ squares, counting
order and signs. For example $N_5(1) = 0$ (since 5 is not a perfect square), $N_5(2) = 8$ (since
$5 = (\pm 1)^2 + (\pm 2)^2 = (\pm 2)^2 + (\pm 1)^2$), $N_5(3) = 24$, etc. Their generating functions are^{27}

$$\sum_{n=0}^{\infty} N_n(k)\, q^n = \bigl(\theta_3(q)\bigr)^k\ ,$$

where

$$\theta_3(q) := \sum_{m=-\infty}^{\infty} q^{m^2} = 1 + 2q + 2q^4 + 2q^9 + \cdots$$

is called a theta function. It turns out that $\theta_3$ transforms nicely with respect to $\Gamma$, once
we make the change-of-variables $q = \exp[\pi i\tau]$. This takes work to show. For example,
$\theta_3$ is clearly invariant under the action of $\begin{pmatrix} 1 & 2\\ 0 & 1\end{pmatrix}$, and a little work (from e.g. Poisson
summation) shows that $\begin{pmatrix} 0 & -1\\ 1 & 0\end{pmatrix}$ takes $\theta_3(\tau)$ to $\sqrt{\tau/i}\;\theta_3(\tau)$. $\theta_3$ is not precisely a modular
function (it is a 'modular form of weight $\frac{1}{2}$' for $\Gamma_0(4)$), but this simple example illustrates
the point that $\Gamma$ (and related groups) appear throughout number theory. More on this
shortly.
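
The counting claims above are small enough to check by machine. Here is a rough Python sketch (the function names are invented for this illustration) which assumes the generating function is the $k$-th power of $\theta_3(q) = \sum_m q^{m^2}$. It reads $N_n(k)$ off from that power and compares it with a brute-force enumeration.

```python
from itertools import product

def theta3_coeffs(nmax):
    """Coefficients of theta_3(q) = sum over all integers m of q^(m*m), up to q^nmax."""
    coeffs = [0] * (nmax + 1)
    m = 0
    while m * m <= nmax:
        coeffs[m * m] += 1 if m == 0 else 2  # +m and -m give the same square
        m += 1
    return coeffs

def series_mul(a, b, nmax):
    """Product of two power series (coefficient lists), truncated after q^nmax."""
    out = [0] * (nmax + 1)
    for i, ai in enumerate(a):
        if ai == 0:
            continue
        for j, bj in enumerate(b):
            if i + j > nmax:
                break
            out[i + j] += ai * bj
    return out

def count_by_series(n, k):
    """N_n(k), read off as the q^n coefficient of theta_3(q)^k."""
    theta = theta3_coeffs(n)
    power = [1] + [0] * n
    for _ in range(k):
        power = series_mul(power, theta, n)
    return power[n]

def count_by_brute_force(n, k):
    """N_n(k) by listing every k-tuple of integers whose squares sum to n."""
    bound = int(n ** 0.5) + 1
    values = range(-bound, bound + 1)
    return sum(1 for xs in product(values, repeat=k)
               if sum(x * x for x in xs) == n)

for k in (1, 2, 3):
    print(k, count_by_series(5, k), count_by_brute_force(5, k))
```

The two counts should agree for every $k$; the brute-force version is only practical for small $n$ and $k$, which is one reason the generating function is the more useful object.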

That important change-of-variables $q = \exp[\pi i\tau]$ was introduced by Jacobi early last
century, in his analysis of 'elliptic integrals'. The theory is beautiful and poorly remembered
today, which is very disappointing considering how much of modern math was
touched by it. I strongly recommend the book [10], written over a century ago; the style
and motivation of math in our century is different from that in Jacobi's, and we've lost a
little in motivation what we've gained in power. I'll briefly sketch Jacobi's theory.



^{27} A fundamental principle in math is: whenever you have a subscript with an infinite range, make
a power series (called a generating function) out of it.

Just as we could develop a theory of 'circular functions' (i.e. sine etc.) starting from
the integral $s(a) = \int_0^a \frac{dx}{\sqrt{1-x^2}}$, so can we develop a theory of 'elliptic functions' starting
from the 'elliptic integral' $F(k,a) = \int_0^a \frac{dx}{\sqrt{(1-x^2)(1-k^2x^2)}}$. Inverting $s(a)$ gives a function
both more useful and with nicer properties than $s(a)$: we call it $\sin(u)$. Similarly, for any
$k$ the elliptic function $\mathrm{sn}(k,u)$ is defined by $u = F(k, \mathrm{sn}(k,u))$. Just as we can define a
numerical constant $\pi$ by $\sin(\frac{1}{2}\pi) = 1$ (i.e. $\frac{1}{2}\pi = \int_0^1 \frac{dx}{\sqrt{1-x^2}}$), so we get a function
$K(k) = \int_0^1 \frac{dx}{\sqrt{(1-x^2)(1-k^2x^2)}}$. Just as $\sin(u)$ has period $4(\frac{1}{2}\pi)$, so has sn $u$-period $4K(k)$. sn also
turns out to have $u$-period $4i\,K(k')$ where $k' = \sqrt{1-k^2}$; today we take this as the
starting point and define an elliptic function to be doubly periodic (see [42] or Cohen in
[60]).

The theta functions aren't elliptic functions, but they are closely related, and e.g. sn
can be written as a quotient of them. In Jacobi's language, we have

$$\tau = i\,\frac{K(k')}{K(k)}\ .$$

The 'modular transformation' $\tau\mapsto -1/\tau$ corresponds to interchanging the 'modulus' $k$ with
the 'complementary modulus' $k'$, and thus is completely natural in Jacobi's theory. The
important formula $\theta_3(-1/\tau) = \sqrt{\tau/i}\;\theta_3(\tau)$ is trivial here.

A certain interpretation of modular functions also indicates their usefulness, and
played an important role in Part 1. A torus is something that looks like the surface
of a bagel, at least as far as its topology is concerned. For example, the Cartesian product
$S^1\times S^1$ of circles is a torus (think of one circle being the contact-circle of the bagel with
the table on which it rests, then from each point on that horizontal circle imagine placing
a vertical circle perpendicular to it, like a rib; together all these ribs fill out the bagel's
surface). A more sophisticated example of a torus is an elliptic curve (a complex curve of
the form $y^2 = ax^3 + bx^2 + cx + d$ and a special point on it playing the role of 0). A final
example is the quotient $\mathbb{C}/\Lambda$ of the complex plane $\mathbb{C}$ with a 2-dimensional lattice $\Lambda$ (we
saw lattices in §1.6; $\Lambda$ here will be a discrete doubly-periodic set of points in $\mathbb{C}$, containing
0). It turns out that certain equivalence classes of tori (e.g. with respect to conformal
or complex-analytic equivalence) always contain a representative torus of the form $\mathbb{C}/\Lambda$,
where $\Lambda$ consists of all points $\mathbb{Z} + \mathbb{Z}\tau$, for some $\tau\in\mathcal{H}$. (Incidentally, the cusps correspond
to degenerate tori.) In other words, these equivalence classes are parametrized by complex
numbers $\tau$ in $\mathcal{H}$. So if we have a complex-valued function $F$ on the set of all tori, which is
e.g. conformally invariant (an example is the genus-one partition function $Z$ in conformal
field theories, see §1.1), then we can consider $F$ as a well-defined function $F : \mathcal{H}\to\mathbb{C}$.
However, it turns out that different points $\tau$ in $\mathcal{H}$ correspond to the same equivalence class
of tori: e.g. the lattice for $\tau$ is the same as that for $\tau+1$, and these are a rescaling of
that for $-1/\tau$. Thus $F(\tau) = F(\tau+1) = F(-1/\tau)$, because $\tau$, $\tau+1$, $-1/\tau$ all represent
equivalent tori. Since $\tau\mapsto\tau+1$ and $\tau\mapsto -1/\tau$ generate $\mathrm{PSL}_2(\mathbb{Z})$, what in fact we find is
that $F$ has $\Gamma$ as its group of symmetries. One often says that $\Gamma$ is the 'modular group of
the torus', and that the orbit space $\Gamma\backslash\mathcal{H}$ is the 'moduli space' of (conformal equivalence
classes of) tori. $\mathcal{H}$ is called its 'Teichmüller space' or 'universal cover'. This is exactly
analogous to $S^1 = \mathbb{R}/\mathbb{Z}$: $\mathbb{R}$ is its universal cover and $\mathbb{Z}$ is its 'modular group' (or 'mapping
class group'). Another example: the Teichmüller space for (conformal equivalence classes
of) 'pair-of-pants', or equivalently a disc minus two open interior disks, is $\mathbb{R}^3$ (an ordered
triple), while its modular group is the symmetric group $\mathfrak{S}_3$ and its moduli space consists
of unordered triples. Incidentally, we write $\Gamma\backslash\mathcal{H}$ instead of $\mathcal{H}/\Gamma$ because the group $\Gamma$ acts
on $\mathcal{H}$ 'on the left'. A good introduction to the geometry here is [55].

In any case, a surprising number of innocent-looking questions in number theory can be
dragged (usually with effort) into the richly developed realm of elliptic curves and modular
functions, and it is there they are often solved. For instance, we all know the ancient Greeks
were interested in Pythagorean triples: find all integer solutions $a, b, c$ to $a^2 + b^2 = c^2$,
i.e. find all integer (or if you prefer, rational) right-angle triangles. They solved this by
elementary means: choose any integers (or rationals) $x, y$ and put $u = \frac{x^2-y^2}{x^2+y^2}$, $v = \frac{2xy}{x^2+y^2}$;
then $u^2 + v^2 = 1$ and (multiplying by the denominator) this gives all Pythagorean triples.
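
As a quick illustration of this parametrisation, here is a small Python sketch (the function names are hypothetical). It uses exact rational arithmetic to confirm that $(u, v)$ lies on the unit circle, and clears denominators to produce an integer triple.

```python
from fractions import Fraction
from math import gcd

def rational_point(x, y):
    """The point (u, v) on the unit circle obtained from rationals x, y (not both 0)."""
    x, y = Fraction(x), Fraction(y)
    d = x * x + y * y
    return (x * x - y * y) / d, 2 * x * y / d

def triple(x, y):
    """Integer triple (a, b, c) with a^2 + b^2 = c^2 from integers x > y > 0,
    obtained by multiplying (u, v, 1) by the denominator and removing common factors."""
    a, b, c = x * x - y * y, 2 * x * y, x * x + y * y
    g = gcd(gcd(a, b), c)
    return a // g, b // g, c // g

u, v = rational_point(2, 1)
print(u, v, u * u + v * v)   # exact arithmetic, so the last value is exactly 1
print(triple(2, 1), triple(3, 2), triple(4, 1))
```

Because `Fraction` is exact, the check $u^2 + v^2 = 1$ involves no rounding.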

There are two ways of extending this problem. One is to ask which $n\in\mathbb{Z}$ can arise as
areas of these rational right-angle triangles. It turns out $n = 5$ is the smallest one: $a = \frac{3}{2}$,
$b = \frac{20}{3}$, $c = \frac{41}{6}$ works ($5 = \frac{1}{2}(\frac{3}{2})(\frac{20}{3})$ and $(\frac{3}{2})^2 + (\frac{20}{3})^2 = (\frac{41}{6})^2$). This is a hard problem:
just try to show $n = 1$ cannot work. $n = 157$ turns out to work: the simplest triangle has $a$
and $b$ as quotients of integers of size around $10^{25}$, and $c$ as the quotient of integers around
$10^{47}$. Although this problem was studied by the ancient Greeks and also by the Arabs in
the 10th century, it was finally cracked in the 1980s. It was solved by first translating it
into the question of whether the elliptic curve $y^2 = x^3 - n^2x$ has infinitely-many rational
points, and then applying all the rich 20th century machinery to answering that question.
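
The $n = 5$ triangle and its link to the curve $y^2 = x^3 - n^2x$ can be verified exactly. The sketch below is not from the notes: it assumes one standard way of turning a triangle $(a, b, c)$ of area $n$ into a rational point, namely $x = (c/2)^2$, $y = c(b^2 - a^2)/8$, and the function names are illustrative.

```python
from fractions import Fraction as F

def is_right_triangle(a, b, c):
    """True when a^2 + b^2 = c^2 holds exactly."""
    return a * a + b * b == c * c

def area(a, b):
    """Area of the right triangle with legs a and b."""
    return a * b / 2

def curve_point(a, b, c):
    """Assumed correspondence: a triangle of area n gives the point
    x = (c/2)^2, y = c*(b^2 - a^2)/8 on y^2 = x^3 - n^2 x."""
    return (c / 2) ** 2, c * (b * b - a * a) / 8

def on_curve(x, y, n):
    """True when (x, y) satisfies y^2 = x^3 - n^2 x exactly."""
    return y * y == x ** 3 - n * n * x

a, b, c = F(3, 2), F(20, 3), F(41, 6)
x, y = curve_point(a, b, c)
print(is_right_triangle(a, b, c), area(a, b))
print(x, y, on_curve(x, y, 5))
```

The correspondence works because $x - n = (a-b)^2/4$ and $x + n = (a+b)^2/4$ when $ab = 2n$, so $x(x-n)(x+n)$ is a perfect square.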

The other continuation of the Pythagorean triples question is more famous: find all
integer solutions to $a^n + b^n = c^n$ (or equivalently all rational solutions to $a^n + b^n = 1$). 350
years ago Fermat wrote in the margin of the book he was reading (the book was describing
the Greek solution to Pythagorean triples) that he had found a "truly marvelous" proof
that for $n > 2$ there are no nontrivial solutions, but that the margin was too narrow
to contain it. This result came to be known as 'Fermat's Last Theorem'^{28} and despite
considerable effort no one has succeeded in rediscovering his proof. Most people today
believe that Fermat soon realised his 'proof' wasn't valid, otherwise he would have alluded
to it in later letters. In any case, a very long and complicated proof was finally achieved in
the 1990s: the 'Taniyama conjecture' says that a certain function associated to any elliptic
curve over $\mathbb{Q}$ will be modular; if $a^n + b^n = c^n$ for some $n > 2$, then the elliptic curve
$y^2 = x^3 + (a^n - b^n)x^2 - a^nb^n x$ will violate the Taniyama conjecture; finally Wiles proved
the Taniyama conjecture is true.

To most mathematicians, the 'area-$n$ problem' and 'Fermat's Last Theorem' are interesting
only because they can be related to elliptic curves and modular forms: it's
easy to ask hard questions in math, but most questions tend to be stale. Number theory
is infatuated with modular stuff because (in increasing order of significance) (a) it's
exceedingly rich, with lots of connections to other areas of math and math phys; (b) it's
a battleground on which many innocent-looking but hard-to-crack problems can be slain;
and (c) last generation's number theorists also worked on modular stuff.

^{28} It was called his 'Last Theorem' because it was the last of his 48 margin notes to be proved by
other mathematicians; another one is discussed in Section 1.8. The story of Fermat's Last Theorem is
a fascinating one, but alas this footnote is too small to do it credit. See for instance the excellent
book [53].

In any case, modular functions turn out to be important for math (and mathematical
physics) even though they may at first glance look artificial. Poincaré explained how to
study them. He said to look at the orbits of $\overline{\mathcal{H}}$ with respect to $\Gamma$. For example, one orbit
contains all the cusps. We write the set of orbits as $\Gamma\backslash\overline{\mathcal{H}}$ (so the cusps together form
a single point of it), and give it the natural
topological structure (i.e. 2 points $[\tau], [\tau']\in\Gamma\backslash\overline{\mathcal{H}}$ are considered 'close' if the 2 sets $\Gamma\tau$, $\Gamma\tau'$
nearly overlap). Note first that by applying $T$ repeatedly, every point in $\mathcal{H}$ corresponds
to a point in the vertical strip $-\frac{1}{2}\le\mathrm{Re}(\tau)\le\frac{1}{2}$, in fact to a unique point in that strip
if we avoid the two edges. $S$ is an inversion through the unit circle, so it permits us to
restrict to those points in the vertical strip which are distance at least 1 from the origin.
The resulting region $R$ is called a fundamental region for $\Gamma$. Apart from the boundary of
$R$, every $\Gamma$-orbit will intersect $R$ in one and only one point.
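
The two-step recipe (translate into the strip, then invert if inside the unit circle) can be turned into a short procedure. The following Python sketch is one possible floating-point implementation; the function name and the step limit are arbitrary choices of this illustration.

```python
import math

def reduce_to_fundamental_region(tau, max_steps=1000):
    """Move tau (with Im(tau) > 0) into the region |Re(tau)| <= 1/2, |tau| >= 1
    using only tau -> tau + k (powers of T) and tau -> -1/tau (S).
    Returns the reduced point and the list of moves that were applied."""
    moves = []
    for _ in range(max_steps):
        k = math.floor(tau.real + 0.5)   # nearest integer to Re(tau)
        if k != 0:
            tau -= k                     # apply T^(-k): lands in the vertical strip
            moves.append(("T", -k))
        if abs(tau) >= 1:                # already outside the unit circle: done
            return tau, moves
        tau = -1 / tau                   # apply S: this increases Im(tau)
        moves.append(("S", 1))
    raise RuntimeError("did not converge; is Im(tau) > 0?")

tau0 = complex(0.3, 0.2)
for start in (tau0, tau0 + 1, -1 / tau0):
    reduced, moves = reduce_to_fundamental_region(start)
    print(start, "->", reduced, moves)
```

The three starting points lie in the same orbit, so they should reduce to the same point of $R$ up to rounding. Points that land exactly on the boundary of $R$ are where the identifications discussed next come into play.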

What should we do about the boundary? Well, the edge $\mathrm{Re}(\tau) = -\frac{1}{2}$ gets mapped
by $T$ to the edge $\mathrm{Re}(\tau) = \frac{1}{2}$, so we should identify (=glue together) these. The result is a
cylinder running off to infinity, with a strange lip at the bottom. $S$ tells us how we should
close that lip: identify $i e^{i\theta}$ and $i e^{-i\theta}$. This seals the bottom of the cylinder, so we get an
infinitely tall cup with a strangely puckered base. In fact the top of this cup is also capped
off, by the cusp $i\infty$. So what we have (topologically speaking) is a sphere. It does not look
like a smooth sphere, but in fact it inherits the smoothness of $\mathcal{H}$.

Incidentally, topological manifolds of dimension $\le 3$ always have a unique compatible
smooth structure. 'Topological structure' means you can speak of continuity or closeness,
'smooth structure' means you can also do calculus. On the other hand, $\mathbb{R}^4$ has infinitely
many smooth structures compatible with its topological structure; mysteriously, all other
Euclidean spaces $\mathbb{R}^n$ have a unique smooth structure! Thus both mathematics and physics
single out 4-space. Coincidence???

So anyways, what this construction of $\Gamma\backslash\overline{\mathcal{H}}$ means is that a modular function can
be reinterpreted as a meromorphic complex-valued function on this sphere. This is very
useful, because our undergraduate complex variables class taught us all about meromorphic
complex-valued functions $f$ on the Riemann sphere $\mathbb{C}\cup\{\infty\}$. There are many meromorphic
functions on $\mathbb{C}$, but to also be meromorphic at $\infty$ forces $f$ to be rational, i.e.
$f(w) = \frac{\text{some polynomial } P(w)}{\text{some polynomial } Q(w)}$, where $w$ is the complex parameter on the Riemann sphere. So our
modular function $f(\tau)$ will simply be some rational function $P/Q$ evaluated at the change-of-variables
function $w = c(\tau)$ which maps us from our sphere $\Gamma\backslash\overline{\mathcal{H}}$ to the Riemann sphere.
There are many different choices for this function $c(\tau)$, but the standard one is $c(\tau) = j(\tau)$,
the $j$-function of (2.1.1)^{29}. Thus, any modular function can be written as a rational
function $f(\tau) = P(j(\tau))/Q(j(\tau))$ in the $j$-function. Conversely, any such function will be
modular.

This is analogous to saying that any function $g(x)$ periodic under $x\mapsto x+1$ can be
thought of as a function on the unit circle $S^1\subset\mathbb{C}$ evaluated at the change-of-variables
function $x\mapsto e^{2\pi i x}$, and hence has a Fourier expansion $\sum_n g_n\exp[2\pi i n x]$.

^{29} Historically, $j$ was the standard choice, but in Moonshine the preferred choice would be the
function $J = j - 744$ with zero constant term.

We can generalise this argument. Consider a subgroup $G$ of $\mathrm{SL}_2(\mathbb{R})$ which is both not
too big, and not too small. 'Not too big' means it should be discrete, i.e. the matrices in
$G$ can only get so close to the identity matrix $\begin{pmatrix} 1 & 0\\ 0 & 1\end{pmatrix}$. To make sure $G$ is 'not too small',
it is enough to require that $G$ contains some subgroup of the form

$$\Gamma_0(N) := \left\{ \begin{pmatrix} a & b\\ c & d\end{pmatrix}\in\mathrm{SL}_2(\mathbb{Z}) \;\Big|\; c\equiv 0\ (\mathrm{mod}\ N) \right\}\ ,$$   (2.3.2a)

i.e. $G$ must contain all matrices in $\Gamma$ whose bottom-left entry is a multiple of $N$. So $G$
must contain $T$, for example. We will also be interested only in those $G$ which obey

$$\begin{pmatrix} 1 & t\\ 0 & 1\end{pmatrix}\in G \;\Rightarrow\; t\in\mathbb{Z}$$   (2.3.2b)

i.e. the only translations in $G$ are by integers. We will call a function $f : \mathcal{H}\to\mathbb{C}$ a modular
function for $G$ if it is meromorphic (including at the cusps $\mathbb{Q}\cup\{i\infty\}$), and if also $f$ is
symmetric with respect to $G$: $f\circ\alpha = f$ for all $\alpha\in G$. This implies we will be able to
expand $f$ as a Laurent series in $q$. We analyse this as before: look at the orbit space
$\Sigma = G\backslash\overline{\mathcal{H}}$; because $G$ is not too big, $\Sigma$ will be a (Riemann) surface; because $G$ is not too
small, $\Sigma$ will be compact.

The compact Riemann surfaces have been classified (up to homeomorphism, i.e.
considering only topology as relevant), and are characterised by a number called the genus.
Genus 0 is a sphere, genus 1 is a torus, genus 2 is like two tori resting side-by-side, etc.
For example, the surface of a wine glass, or a fork, is topologically a sphere, while a coffee
cup and a key will (usually) be tori. Eye glasses with the lenses popped out is a 2-torus,
while a ladder with $n$ rungs on it has genus $n - 1$.

We will call $G$ 'genus $g$' if its surface $\Sigma$ has genus $g$. For example, $G = \Gamma_0(2)$ and
$G = \Gamma_0(25)$ are both genus 0, while $\Gamma_0(50)$ is genus 2 and $\Gamma_0(24)$ is genus 1. Once again,
we are interested here in the genus 0 case. As before, this means that there is a change-of-variables
function we'll denote $J_G$ which has the property that it's a modular function
for $G$, and all other modular functions for $G$ can be written as a rational function in it.
Because of (2.3.2), we can choose $J_G$ to look like

$$J_G(\tau) = q^{-1} + a_1(G)\,q + a_2(G)\,q^2 + \cdots$$

So $J_G$ plays exactly the same role for $G$ that $J := j - 744$ plays for $\Gamma$. $J_G$ is called the
Hauptmodul for $G$. (Incidentally for genus $> 0$, two generators, not one, are needed.)
For example, $\Gamma_0(2)$, $\Gamma_0(13)$ and $\Gamma_0(25)$ are all genus 0, with Hauptmoduls

$$J_2(\tau) = q^{-1} + 276\,q - 2048\,q^2 + 11202\,q^3 - 49152\,q^4 + 184024\,q^5 + \cdots$$   (2.3.3)
$$J_{13}(\tau) = q^{-1} - q + 2q^2 + q^3 + 2q^4 - 2q^5 - 2q^7 - 2q^8 + q^9 + \cdots$$   (2.3.4)
$$J_{25}(\tau) = q^{-1} - q + q^4 + q^6 - q^{11} - q^{14} + q^{21} + \cdots$$   (2.3.5)
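
The expansions (2.3.3)-(2.3.5) can be reproduced by machine. The sketch below relies on a fact that is not stated in these notes and is brought in here as an assumption: for $N = 2, 13, 25$ the Hauptmodul can be written as the eta quotient $(\eta(\tau)/\eta(N\tau))^e + e$ with $e = 24/(N-1)$, where $\eta(\tau) = q^{1/24}\prod_{n\ge 1}(1-q^n)$ and $q = e^{2\pi i\tau}$. All function names are illustrative.

```python
def series_mul(a, b, nterms):
    """Product of two power series (coefficient lists), keeping nterms terms."""
    out = [0] * nterms
    for i, ai in enumerate(a[:nterms]):
        if ai == 0:
            continue
        for j, bj in enumerate(b[:nterms - i]):
            out[i + j] += ai * bj
    return out

def euler_product(step, exponent, nterms):
    """Coefficients of prod_{n>=1} (1 - q^(step*n))^exponent up to q^(nterms-1).
    A negative exponent is handled by dividing instead of multiplying."""
    a = [1] + [0] * (nterms - 1)
    for m in range(step, nterms, step):
        for _ in range(abs(exponent)):
            if exponent > 0:
                for i in range(nterms - 1, m - 1, -1):   # multiply by (1 - q^m)
                    a[i] -= a[i - m]
            else:
                for i in range(m, nterms):               # divide by (1 - q^m)
                    a[i] += a[i - m]
    return a

def hauptmodul_coeffs(N, nmax):
    """[a_1, ..., a_nmax] for q^-1 + sum a_n q^n = (eta(tau)/eta(N tau))^e + e,
    assuming e = 24/(N-1) is an integer (an assumption of this sketch)."""
    assert 24 % (N - 1) == 0
    e = 24 // (N - 1)
    nterms = nmax + 2
    quotient = series_mul(euler_product(1, e, nterms),
                          euler_product(N, -e, nterms), nterms)
    # quotient[k] is the coefficient of q^(k-1); its constant term -e is
    # cancelled by the "+ e", so the series starts q^-1 + 0 + a_1 q + ...
    assert quotient[0] == 1 and quotient[1] == -e
    return quotient[2:]

print(hauptmodul_coeffs(2, 5))
print(hauptmodul_coeffs(13, 9))
print(hauptmodul_coeffs(25, 21))
```

The printed lists should match the coefficients displayed in (2.3.3), (2.3.4) and (2.3.5), with zeros in the positions where no power of $q$ appears.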

The smaller the modular group, the smaller the coefficients of the Hauptmodul. In this
sense, the $j$-function is optimally bad among the Hauptmoduls: e.g. for it $a_{23}\approx 10^{25}$.

An obvious question is, how many genus 0 groups (equivalently, how many Hauptmoduls)
are there? It turns out that $\Gamma_0(p)$ is genus 0, for a prime $p$, iff $p - 1$ divides
24. Thompson in 1980 proved that for any $g$, there are only finitely many genus $g$ groups
obeying our two conditions (2.3.2). In particular this means there are only finitely many
Hauptmoduls. Over 600 Hauptmoduls with integer coefficients $a_i(G)$ are presently known.
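
The genus of $\Gamma_0(N)$ can be computed from a classical closed formula, which is not given in these notes and is quoted here as an outside fact: $g = 1 + \frac{\mu}{12} - \frac{\nu_2}{4} - \frac{\nu_3}{3} - \frac{\nu_\infty}{2}$, where $\mu$ is the index of $\Gamma_0(N)$ in $\Gamma$ (modulo $\pm I$), $\nu_2$ and $\nu_3$ count elliptic points, and $\nu_\infty$ counts cusps. A Python sketch with hypothetical function names:

```python
from math import gcd

def prime_factors(n):
    """Distinct prime factors of n, by trial division."""
    primes, p = [], 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes

def euler_phi(n):
    """Euler's totient function."""
    for p in prime_factors(n):
        n = n // p * (p - 1)
    return n

def genus_gamma0(N):
    """Genus of Gamma_0(N) via g = 1 + mu/12 - nu2/4 - nu3/3 - nu_inf/2."""
    primes = prime_factors(N)
    mu = N
    for p in primes:
        mu = mu // p * (p + 1)               # index: N * prod (1 + 1/p)
    nu2 = 0 if N % 4 == 0 else 1
    nu3 = 0 if N % 9 == 0 else 1
    for p in primes:
        # 1 + (-4/p) and 1 + (-3/p), with the symbols written out case by case
        nu2 *= 1 + (0 if p == 2 else (1 if p % 4 == 1 else -1))
        nu3 *= 1 + (0 if p == 3 else (1 if p % 3 == 1 else -1))
    nu_inf = sum(euler_phi(gcd(d, N // d)) for d in range(1, N + 1) if N % d == 0)
    twelve_g = 12 + mu - 3 * nu2 - 4 * nu3 - 6 * nu_inf
    assert twelve_g % 12 == 0
    return twelve_g // 12

print({N: genus_gamma0(N) for N in (2, 13, 24, 25, 50)})
small_primes = [p for p in range(2, 100) if prime_factors(p) == [p]]
print([p for p in small_primes if genus_gamma0(p) == 0])
```

For primes below 100 the last line should list exactly those $p$ with $p - 1$ dividing 24, in line with the criterion quoted above.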



2.4. The Monstrous Moonshine Conjectures 

We are now ready to make precise the main conjecture of Conway and Norton [12]. 
(We should emphasise though that there have been several other conjectures, some of 
which turned out to be partially wrong.) 

They conjectured that for each element $g$ of the Monster $\mathbb{M}$, there is a Hauptmodul

$$J_g(\tau) = q^{-1} + \sum_{n=1}^{\infty} a_n(g)\,q^n$$   (2.4.1)

for a genus 0 group $G_g$ such that each coefficient $a_n(g)$ is an integer, and for each $n$ the
map $g\mapsto a_n(g)$ is a character of $\mathbb{M}$. They also conjectured that $G_g$ contains $\Gamma_0(N)$ as a
normal subgroup, for some $N$ depending on the order of $g$.

Another way of saying this is that there exists an infinite-dimensional graded representation
$V = V_{-1}\oplus\bigoplus_{n=1}^{\infty} V_n$ of $\mathbb{M}$ such that the McKay-Thompson series $T_g(\tau)$ in (2.1.4)
is a Hauptmodul.

There are around $8\times 10^{53}$ elements to the Monster, so naively we may expect around
$8\times 10^{53}$ different Hauptmoduls $J_g = T_g$. However the character of a representation evaluated
at $g$ and at $hgh^{-1}$ will always be the same, so $J_g = J_{hgh^{-1}}$. Hence the relevant
quantity is the number of conjugacy classes, which for $\mathbb{M}$ is only 194. Moreover, a character
evaluated at $g^{-1}$ will always be the complex conjugate of its value at $g$, but here all
character values $\chi_{V_n}(g)$ are integers (according to the conjecture). Thus $J_g = J_{g^{-1}}$. The
total number of distinct Hauptmoduls $J_g$ arising in Monstrous Moonshine turns out to be
only 171.

For example, if we choose $g$ to be the identity, we recover $T_{id} = J$. It turns out that
there are precisely 2 different conjugacy classes of order 2 elements, one of them giving the
Hauptmodul $J_2$ in (2.3.3). Similarly for 13, but $J_{25}$ doesn't correspond to any conjugacy
class of $\mathbb{M}$.

Moonshine provides an explanation for a forgotten mystery of classical mathematics: 
why are the coefficients of the j-function positive integers? On the other hand, that they 
are integers has long been important to number theory (complex multiplication, class field 
theory — see e.g. [16]). 

There are lots of other less important conjectures. One which played a role in ultimately
proving the main conjecture involves the replication formulae. Conway-Norton
want to think of the Hauptmoduls $J_g$ as being intimately connected with $\mathbb{M}$; if so, then
the group structure of $\mathbb{M}$ should somehow directly relate different $J_g$. In particular, consider
the power map $g\mapsto g^n$. Now, it was well-known that $j(\tau)$ has the property that
$j(p\tau) + j(\frac{\tau}{p}) + j(\frac{\tau+1}{p}) + \cdots + j(\frac{\tau+p-1}{p})$ equals a polynomial in $j$, for any prime $p$ (sketch
of proof: it's a modular function for $\Gamma$, and hence equals a rational function of $j$; since its
only poles will be at the cusps, the denominator polynomial must be trivial). Hence the
same will hold for $J$. Explicitly we get

$$J(2\tau) + J(\tfrac{\tau}{2}) + J(\tfrac{\tau+1}{2}) = J^2(\tau) - 2a_1$$   (2.4.2a)

$$J(3\tau) + J(\tfrac{\tau}{3}) + J(\tfrac{\tau+1}{3}) + J(\tfrac{\tau+2}{3}) = J^3(\tau) - 3a_1 J(\tau) - 3a_2$$   (2.4.2b)

where $J(\tau) = \sum_k a_k q^k$. Slightly more complicated formulas hold in fact for any composite
$n$. Conway and Norton conjectured that these formulas have an analogue for the Moonshine
functions $J_g$ in (2.4.1). In particular, (2.4.2) become for any $g\in\mathbb{M}$

$$J_{g^2}(2\tau) + J_g(\tfrac{\tau}{2}) + J_g(\tfrac{\tau+1}{2}) = J_g^2(\tau) - 2a_1(g)$$   (2.4.3a)

$$J_{g^3}(3\tau) + J_g(\tfrac{\tau}{3}) + J_g(\tfrac{\tau+1}{3}) + J_g(\tfrac{\tau+2}{3}) = J_g^3(\tau) - 3a_1(g)\,J_g(\tau) - 3a_2(g)\ .$$   (2.4.3b)

These are examples of the replication formulae.
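
The first replication formula for $J$ itself can be tested coefficient by coefficient. The Python sketch below is only an illustration under stated assumptions: it builds $J = j - 744$ from the standard expression $j = E_4^3/\Delta$, with $E_4 = 1 + 240\sum_n \sigma_3(n) q^n$ and $\Delta = q\prod_n (1-q^n)^{24}$, which is not taken from these notes. Comparing powers of $q$ in (2.4.2a), the left side contributes $a_{m/2}$ (for even $m$) plus $2a_{2m}$ to the coefficient of $q^m$. The names are hypothetical.

```python
def sigma3(n):
    """Sum of the cubes of the divisors of n."""
    return sum(d ** 3 for d in range(1, n + 1) if n % d == 0)

def series_mul(a, b, nterms):
    """Product of two power series (coefficient lists), keeping nterms terms."""
    out = [0] * nterms
    for i, ai in enumerate(a[:nterms]):
        for j, bj in enumerate(b[:nterms - i]):
            out[i + j] += ai * bj
    return out

def j_minus_744(nmax):
    """Dictionary {n: a_n} for J = q^-1 + sum_{n>=1} a_n q^n, with n from -1 to nmax."""
    size = nmax + 2
    e4 = [1] + [240 * sigma3(n) for n in range(1, size)]
    e4_cubed = series_mul(series_mul(e4, e4, size), e4, size)
    inv = [1] + [0] * (size - 1)          # 1 / prod (1 - q^n)^24
    for m in range(1, size):
        for _ in range(24):
            for i in range(m, size):
                inv[i] += inv[i - m]
    j = series_mul(e4_cubed, inv, size)   # j[k] is the coefficient of q^(k-1)
    coeffs = {k - 1: j[k] for k in range(size)}
    coeffs[0] -= 744                      # remove the constant term
    return coeffs

def check_replication_2(max_power):
    """Compare both sides of (2.4.2a) in every power q^m with -2 <= m <= max_power."""
    a = j_minus_744(2 * max_power)
    get = lambda n: a.get(n, 0)
    for m in range(-2, max_power + 1):
        lhs = (get(m // 2) if m % 2 == 0 else 0) + 2 * get(2 * m)
        rhs = sum(get(i) * get(m - i) for i in range(-1, m + 2))
        if m == 0:
            rhs -= 2 * get(1)
        if lhs != rhs:
            return False
    return True

a = j_minus_744(4)
print([a[n] for n in range(-1, 5)])
print(check_replication_2(6))
```

One consequence visible in the $q^2$ coefficient is the relation $a_4 = a_3 + (a_1^2 - a_1)/2$, so (2.4.2a) already determines some coefficients of $J$ from earlier ones.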

'Replication' concerns the power map $g\mapsto g^n$ in $\mathbb{M}$. Can Moonshine see more of the
group structure of $\mathbb{M}$? A step in this direction was made by Norton [47], who associated a
Hauptmodul to commuting elements $g, h$ in $\mathbb{M}$. Physically [19], this corresponds to orbifold
traces, i.e. the $V^\natural$ RCFT with boundary conditions twisted by $g$ and $h$ in the 'time' and
'space' directions. Still, we would like to see more of $\mathbb{M}$ in Moonshine.

An important part of the Monstrous Moonshine conjectures came a few years after [12].
Frenkel-Lepowsky-Meurman [23] constructed a graded infinite-dimensional representation
$V^\natural$ of $\mathbb{M}$ and conjectured (correctly) that it is the representation in (2.1.4). $V^\natural$ has a very
rich algebraic structure, which will be discussed in §2.6.

A major claim of [23] was that $V^\natural$ is a 'natural' structure (hence their notation).
To see what they mean by that, it's best to view another simpler example of a natural
construction: that of the Leech lattice $\Lambda_{24}$. Recall the discussion of (root) lattices in §1.6.

$\Lambda_{24}$ is one of the most interesting lattices, and is related to Moonshine. It can be
defined using 'laminated lattices'. Start with the 0-dimensional lattice $\Lambda_0 = \{0\}$, which
consists of just a single point. Use it to construct a 1-dimensional lattice, with minimal
(nonzero) norm 4, built out of infinitely many copies of $\Lambda_0$ laid side by side. The result
of course is simply the even integers $2\mathbb{Z}$, which we will call here $\Lambda_1$. Now construct a
2-dimensional lattice, of minimum norm 4, built out of infinitely many copies of $\Lambda_1$ laid next
to each other. There are lots of ways to do this, but choose the densest lattice possible.
The result is unique: it is the hexagonal lattice $A_2$ scaled by a factor of $\sqrt{2}$: call it $\Lambda_2$.
Continue in this way: $\Lambda_3$, $\Lambda_4$, $\Lambda_5$, $\Lambda_6$, $\Lambda_7$, and $\Lambda_8$ will be the root lattices $A_3$, $D_4$, $D_5$,
$E_6$, $E_7$ and $E_8$, respectively, all scaled by $\sqrt{2}$. See [13] chapter 6 for a more complete
treatment of laminated lattices.

The 24th repetition of this construction yields the Leech lattice. It is the unique
24-dimensional self-dual lattice with no norm-2 vectors, and provides among other things
the densest known packing of 23-dimensional spheres in $\mathbb{R}^{24}$. Many of its properties are
discussed throughout [13]. So lamination provides us with a sort of no-input construction
of the Leech lattice, and a good example of the mathematical meaning of 'natural'. After
dimension 24, it seems chaos results from the lamination procedure (there are 23 different
25-dimensional lattices that have an equal right to be called $\Lambda_{25}$, and over 75 000 are
expected for $\Lambda_{26}$).

It is natural to ask about Moonshine for other groups. There is a partial Moonshine
for the Mathieu groups $M_{24}$ and $M_{12}$ (which have about $2\times 10^8$ and $10^5$ elements resp.),
the automorphism group $.0$ of $\Lambda_{24}$ (which has about $8\times 10^{18}$ elements), and a few others;
see e.g. [49]. These groups are either simple or almost simple (e.g. $.0$ is the direct product
of $\mathbb{Z}_2$ with the simple group $.1$). More generally, there will be some sort of Moonshine for
any group which is the automorphism group of a vertex operator algebra; the finite simple
groups of Lie type should be automorphism groups of VOAs closely related to the affine
algebras except defined over fields like $\mathbb{Z}_p$.

There is a geometric side to Moonshine, associated to names like Lian-Yau and
Hirzebruch. In particular, Hirzebruch's 'prize question' asks for the construction of a
24-dimensional manifold on which $\mathbb{M}$ acts, whose twisted elliptic genera are the McKay-Thompson
series. This is still open.

It should be emphasised that Monstrous Moonshine is a completely unexpected connection
between finite groups and modular functions. Although there has been enormous
progress in our understanding of this connection (so much so, that Richard Borcherds
won the 1998 Fields medal for his work on this), there still is mystery at its heart. In
particular, that $\mathbb{M}$ is associated with modular functions can be explained mathematically
by it being the automorphism group of the Moonshine VOA $V^\natural$, and physically by the
associated RCFT, but what is so special about $\mathbb{M}$ that these modular functions should be
genus 0? We will come back to this in §2.9.

2.5. Formal Power Series 

Vertex (operator) algebras (VOAs) are a mathematically precise formulation of the
notion of W-algebra or chiral algebra^{30} which is so central to conformal field theory (see
§1.1). VOAs were first defined by Borcherds, and their theory has since been developed
by a number of people (Frenkel, Lepowsky, Meurman, Zhu, Dong, Li, Mason, Huang, ...).
Because our primary motivation here is Moonshine, I will only focus on one aspect of their
theory (the connection with Lie algebras). Useful to consult while reading this review are
the notes [27]: they take a more analytic approach to many of the things we discuss, and
their approach (namely that of CFT) motivates beautifully much of VOA theory.

In quantum field theory the basic object is the quantum field, which roughly speaking
is a choice of operator $A(x)$ at each space-time point $x$. 'Operator' means something
that 'operates on' functions or vectors. E.g. an indefinite integral is an operator, as is a
derivative. The operators in the QFT act on the space spanned by the states $|\star\rangle$, and
together form an infinite-dimensional vector space (e.g. a $C^*$ algebra). This infinite-dimensionality
of QFT is a major source of its mathematical difficulties, and QFT still has
not been put on completely satisfactory mathematical grounds.



^{30} An alternate (and much more complicated) mathematical formulation of chiral algebra is due to
Beilinson and Drinfeld, and belongs to algebraic geometry. See [28] for a good (but still difficult)
review.

But another difficulty is that the quantum field $A$ really isn't an operator-valued
function of space-time. 'Function' is too narrow a concept. For example, one of the most
familiar 'functions' in quantum mechanics is the Dirac delta $\delta(x)$. You see it for example in
the canonical commutation relations: e.g. for a scalar field $\varphi$, we have
$[\varphi(x,t), \frac{\partial}{\partial t}\varphi(y,t)] = i\hbar\,\delta^3(x-y)$. $\delta(x)$ has the property that for any other smooth function $f$,

$$\int_{-1}^{1} f(y)\,\delta(y)\,dy = f(0)\ , \qquad \int_{-1}^{1} f(y)\,\delta'(y)\,dy = -f'(0)\ ,$$

etc. The problem is that $\delta(x)$ isn't a function: no function could possibly have those
properties.

One way to make sense of 'functions' like the Dirac delta and its derivatives is distribution
theory. Although it was first informally used in physics, it was rigorously developed
around 1950 by Laurent Schwartz, and uses the idea of test functions. See e.g. [15].

What I will describe now is an alternate approach, algebraic as opposed to analytic.
These two approaches are not equivalent: you can do some things in one approach which
you can't do in the other. But the algebraic approach is considerably simpler technically
(no calculus or convergence to worry about) and it is remarkable how much can still
be captured. This approach is the starting point for the VOA story described next section,
and was first created around 1980 by Garland and Date-Kashiwara-Miwa. Keep in mind
that what we are trying to capture is an operator-valued 'function' on space-time. Space-time
in CFT is 2-dimensional, and so we can think of it (at least locally) as being on the
complex plane $\mathbb{C}$ (more precisely, we will usually associate the space-time point $(x, t)$ with
the complex number $z = e^{t+ix}$). Good introductions to the material in this section are
[23,39,31].

Let $W$ be any vector space. We are most interested in it being an infinite-dimensional
space of matrices (i.e. operators on an infinite-dimensional space), but forget that for now.
Define $W[[z, z^{-1}]]$ to be the set of all formal series $\sum_{n=-\infty}^{\infty} w_n z^n$, where the coefficients
$w_n$ lie in our space $W$. We don't ask here whether a given series converges or diverges: $z$
is merely a formal variable. We will also be interested in $W[z, z^{-1}]$ (Laurent polynomials).
We can add these formal series in the usual way, and multiply them by numbers (scalars)
in the usual way.

Remember our ultimate aim here: we want to capture quantum fields. So we want
our formal series to be operator-valued. The way to accomplish this is to choose $W$ to
be a vector space of operators, or matrices if you prefer. A fancy way to say this is
'$W = \mathrm{End}(V)$', which means the things in $W$ operate on vectors in $V$. If we take $V = \mathbb{C}^m$,
then we can think of $W$ as being the space of all $m\times m$ complex matrices. We are ultimately
interested in the case $m = \infty$, but we won't lose much now by taking $m = 1$, which would
mean formal power series with numerical coefficients.

Because our coefficients $w_n$ are operators, we can multiply our formal series. We
define multiplication in the usual way. For example, consider $W = V = \mathbb{C}$, and take
$c(z) = z^{21} - 5z^{100}$ and $d(z) = \sum_{n=-\infty}^{\infty} z^n$. Then

$$c(z)\,d(z) = \sum_{n=-\infty}^{\infty} z^{n+21} - 5\sum_{n=-\infty}^{\infty} z^{n+100} = \sum_{n=-\infty}^{\infty} z^{n} - 5\sum_{n=-\infty}^{\infty} z^{n} = -4\,d(z)\ .$$

So far so good. Now try to compute the square $d(z)^2$. You get infinity. So the lesson is:
you can't always multiply in $W[[z, z^{-1}]]$. We'll come back to this later.

But first, look again at that first product: $c(z)\,d(z) = -4\,d(z)$. One thing it tells us
is that we can't always divide (certainly $c(z)$ and $-4$ are two very different power series!).
But there's another lesson here: if you work out a few more multiplications of this kind,
what you'll find is that $f(z)\,d(z) = f(1)\,d(z)$ for any $f$, at least for those $f$ for which $f(1)$
exists (e.g. any $f\in W[z, z^{-1}]$). Thus $d(z)$ is what we would call the Dirac delta $\delta(z-1)$!
(You can think of it as the Fourier expansion of the Dirac delta, followed by a change of
variables). Unfortunately, the standard notation here is to write it without the '$-1$':

$$\delta(z) := \sum_{n=-\infty}^{\infty} z^n$$

and that is the notation we will also adopt. Similarly, $\delta(az)$ and $\delta'(z)$ etc (which are the
formal series defined in the obvious way) act on $W[z, z^{-1}]$ in the way one would expect:
$f(z)\,\delta(az) = f(\frac{1}{a})\,\delta(az)$ and $f(z)\,\delta'(z) = f(1)\,\delta'(z) - f'(1)\,\delta(z)$. So of course it makes perfect sense
that we couldn't work out $d(z)^2$: we were trying to square the Dirac delta, which we know
is impossible!
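
These formal manipulations are easy to mimic on a computer, as long as one factor is a Laurent polynomial, so that every coefficient of the product is a finite sum. The Python sketch below is only an illustration: a bi-infinite series is modelled as a function from exponents to coefficients, a Laurent polynomial as a dictionary, and all names are invented here. It checks $c(z)\,\delta(z) = -4\,\delta(z)$ and the rule $f(z)\,\delta'(z) = f(1)\,\delta'(z) - f'(1)\,\delta(z)$ on a window of exponents.

```python
def delta(n):
    """Coefficient of z^n in delta(z) = sum over all integers n of z^n."""
    return 1

def delta_prime(n):
    """Coefficient of z^n in delta'(z) = sum_n n z^(n-1), namely n + 1."""
    return n + 1

def times_laurent_poly(poly, series):
    """Product of a Laurent polynomial {exponent: coeff} with a bi-infinite series.
    Each coefficient of the result is a finite sum, so it is always defined."""
    return lambda k: sum(c * series(k - m) for m, c in poly.items())

def evaluate(poly, z):
    """Value of the Laurent polynomial at a nonzero number z."""
    return sum(c * z ** m for m, c in poly.items())

def derivative(poly):
    """Formal derivative of a Laurent polynomial."""
    return {m - 1: m * c for m, c in poly.items() if m != 0}

c = {21: 1, 100: -5}                       # c(z) = z^21 - 5 z^100
product_cd = times_laurent_poly(c, delta)
print([product_cd(k) for k in range(-3, 4)])   # every coefficient equals c(1) = -4

f = {-2: 3, 0: 1, 5: -7}                   # an arbitrary Laurent polynomial
lhs = times_laurent_poly(f, delta_prime)
f1, fp1 = evaluate(f, 1), evaluate(derivative(f), 1)
print(all(lhs(k) == f1 * delta_prime(k) - fp1 * delta(k) for k in range(-20, 21)))
```

Trying the same for $\delta(z)^2$ would require summing $1\cdot 1$ over all integers for each coefficient, which is exactly the divergence described above.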

A similar theory can be developed for several variables $z_i$, with identities such as
$f(z_1, z_2)\,\delta(z_1/z_2) = f(z_1, z_1)\,\delta(z_1/z_2) = f(z_2, z_2)\,\delta(z_1/z_2)$.

But we must not get too overconfident: 

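Paradox 1. Consider the following product:

$$\delta(z) = \Bigl[\Bigl(\sum_{n=0}^{\infty} z^n\Bigr)(1-z)\Bigr]\,\delta(z) = \Bigl(\sum_{n=0}^{\infty} z^n\Bigr)\bigl[(1-z)\,\delta(z)\bigr] = \Bigl(\sum_{n=0}^{\infty} z^n\Bigr)\bigl[0\cdot\delta(z)\bigr] = 0\ .$$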

When physicists are confronted with 'paradoxes' such as this, they tend to respond 
by keeping them in the back of their mind, by treading with care when they are involved 
in a calculation which reminds them of one of the paradoxes, and otherwise trusting their 
instincts. Mathematicians typically over-react: they kick themselves for getting overcon- 
fident and walking head-first into a 'paradox', and then they devise some rule which will 
absolutely guarantee that that paradox will always be safely avoided in the future. We will 
follow the mathematicians' approach, and in the next few paragraphs will describe their 
rule for avoiding Paradox 1: to forbid certain innocent-looking products. 

Remember that we are actually interested in the vector space $W = \mathrm{End}(V)$. Suppose
we have infinitely many matrices $w_i \in \mathrm{End}(V)$. We will call them summable if for every
column vector $v \in V$, only finitely many products $w_i(v) \in V$ are different from 0. In other
words, only finitely many of the matrices $w_i$ have a nonzero first column, only finitely
many have a nonzero second column, ....

We will certainly have a well-defined sum $\sum_i w_i(z)$ if for each fixed $n$, the set $\{w_i(n)\}$
(as $i$ varies) of matrices is summable. All other sums are forbidden. We will certainly have a
well-defined product $\prod_{i=1}^{m} w_i(z)$ if for each $n$, the set $\{w_1(n_1)\,w_2(n_2)\cdots w_m(n_m)\}_{\sum_i n_i = n}$
(vary the $n_i$ subject to $\sum_i n_i = n$) is summable. (Footnote: $m$ here will be finite: we permit
infinite sums but only finite products.) All other products are forbidden. This
is reasonable because the sum of those matrix products $w_1(n_1)\cdots w_m(n_m)$ will precisely
equal the $n$th coefficient of the product $\prod_{i=1}^{m} w_i(z)$.

Note that there are certainly more general ways to have a well-defined product (or
sum). For example, according to our rule, we cannot even add $\sum_n 2^{-n}$. This way has
the advantage of not touching the more complicated realm of convergence issues. We are
doing algebra here, not analysis. The way out of Paradox 1 is that the threefold product
$\bigl(\sum_{n \ge 0} z^n\bigr)(1-z)\,\delta(z)$ cannot be regrouped at will: it is a forbidden product, since
each of its coefficients would require adding up infinitely many nonzero terms.
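One way to see the summability rule at work on Paradox 1 is to count how many nonzero terms would have to be added to produce a single coefficient of the threefold product $\bigl(\sum_{k \ge 0} z^k\bigr)(1-z)\,\delta(z)$. The sketch below is an illustration under stated assumptions, not part of the original notes: `contributing_terms` is a hypothetical helper, and the `cutoff` on the exponents of $\delta(z)$ exists only to make the count finite.

```python
# Sketch: count the nonzero products entering the z^n coefficient of
# (sum_{k>=0} z^k) * (1 - z) * delta(z), with delta(z) cut off at |n3| <= cutoff.
# The rule in the text needs this count to stay finite without any cutoff.

def contributing_terms(n, cutoff):
    """Number of triples (n1, n2, n3) with n1 >= 0, n2 in {0, 1},
    |n3| <= cutoff and n1 + n2 + n3 = n."""
    count = 0
    for n2 in (0, 1):
        for n3 in range(-cutoff, cutoff + 1):
            n1 = n - n2 - n3
            if n1 >= 0:
                count += 1
    return count


counts = [contributing_terms(0, cutoff) for cutoff in (10, 100, 1000)]
print(counts)
```

The count grows linearly with the cutoff, so the family of products is not summable. Each of the two bracketings in Paradox 1 uses only permitted pairwise products, but the threefold product that would relate them is not defined.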

An interesting consequence of the fact that we are doing algebra instead of analysis
is that the product $z^{1/2}\,\delta(z)$ here does not and cannot equal $1^{1/2}\,\delta(z) = \delta(z)$: their formal
power series are very different. In hindsight this 'failing' is understandable: algebraically,
it seems artificial to prefer the positive root of 1 over the negative root.

Paradox 2. Expand $\frac{1}{1-z}$ in a formal power series in $z$ to get $\sum_{n \ge 0} z^n$. Next, expand
$\frac{1}{1-z} = \frac{-z^{-1}}{1-z^{-1}}$ in a formal power series in $z^{-1}$ to get $-\sum_{n<0} z^n$. Subtract these; we
presumably should get 0, but we actually get $\delta(z)$!

The analytic explanation is that the first expression converges only for $|z| < 1$, and
the second for $|z| > 1$, so it would be naive to expect their difference to be 0. We see from
this 'paradox' that it really matters in which variable we expand rational functions. For
instance, at first glance the identity

$$z_0^{-1}\,\delta\Bigl(\frac{z_1 - z_2}{z_0}\Bigr) - z_0^{-1}\,\delta\Bigl(\frac{z_2 - z_1}{-z_0}\Bigr) = z_2^{-1}\,\delta\Bigl(\frac{z_1 - z_0}{z_2}\Bigr)$$

is nonsense; it only holds if you expand the terms in positive powers of $z_2$, $z_1$, and $z_0$
respectively. The procedure of expanding a function in positive and negative powers of a
variable and then subtracting the results, yields what are called expansions of zero; it is
possible to show that expansions of zero will always be linear combinations of Dirac deltas
$\delta(az)$ and their various derivatives $\delta^{(k)}(az)$, as we saw in Paradox 2.



2.6. Ingredient #3: Vertex Operator Algebras 

We are now prepared to introduce the important new structure called vertex oper- 
ator algebras (VOAs). They are essentially the chiral algebras of RCFTs — see [26,27] 
for excellent motivation of the 7 axioms below. A more detailed treatment of the basic 
theory of VOAs is provided by e.g. [23,39,31]. Although VOAs are natural from the CFT 
perspective and appear to be an important and rapidly developing area in math, their 
definition is not easy: Borcherds is known to have said that you either know what they 
are, or you don't want to know. 

A VOA is an (infinite-dimensional) graded vector space $V = \oplus_{n \in \mathbb{Z}} V_n$ with infinitely
many bilinear products $u *_n v$ respecting the grading (in particular $V_k *_n V_\ell \subseteq V_{k+\ell-n-1}$),
which obey infinitely many constraints. 'Bilinear' means that for any $a, a', b, b' \in \mathbb{C}$ and
$u, u', v, v' \in V$, $(au + a'u') *_n (bv + b'v') = ab\,u *_n v + ab'\,u *_n v' + a'b\,u' *_n v + a'b'\,u' *_n v'$, i.e.
that the products are compatible with the vector space structure of $V$. The subspaces $V_n$
must all be finite-dimensional, and they must be trivial (i.e. $V_n = \{0\}$) for all sufficiently
small $n$ (i.e. for $n \approx -\infty$). Note that we can collect all these products into one generating
function: a linear map $Y : V \to (\mathrm{End}\,V)[[z, z^{-1}]]$. That is, to each vector $u \in V$ we
associate the formal power series (called a vertex operator) $Y(u, z) = \sum_{n \in \mathbb{Z}} u_n z^{-n-1}$. For
each $u$, the coefficients $u_n$ will be functions from $V$ to $V$. The idea is that the product
$u *_n v$ will now be written $u_n v := u_n(v)$. The bilinearity of $*_n$ translates into two things
in this new language: that $Y(\star, z)$ is linear, and that each function $u_n$ is itself linear (i.e.
they are endomorphisms).
The constraints are: 

VOA 1. (regularity) $u_n v = 0$ for all $n > N(u, v)$;

VOA 2. (vacuum) there is a vector $\mathbf{1} \in V$ such that $Y(\mathbf{1}, z)$ is the identity (i.e. $\mathbf{1}_n v = \delta_{n,-1}\,v$);

VOA 3. (state-field correspondence) $Y(u, 0)\,\mathbf{1} = u$;

VOA 4. (conformal) there is a vector $\omega \in V$, called the conformal vector, such that $L_n := \omega_{n+1}$
gives us a representation of the Virasoro algebra $\mathfrak{Vir}$, with central term $C \mapsto c\,I$ for some
$c \in \mathbb{C}$;

VOA 5. (translation generator) $Y(L_{-1}u, z) = \frac{d}{dz}\,Y(u, z)$;

VOA 6. (conformal weight) $L_0 u = nu$ whenever $u \in V_n$;

VOA 7. (locality) $(z - w)^M\,[Y(u, z), Y(v, w)] = 0$ for some integer $M = M(u, v)$.

We saw the Virasoro algebra in Part 1 (see (1.2.7)). The number $c$ in VOA 4 is
called the central charge, and is an important invariant of $V$. The peculiar-looking VOA
7 simply says that the commutator $[Y(u,z), Y(v,w)]$ of two vertex operators will be a
finite linear combination of derivatives of various orders of the Dirac delta centred at
$z = w$. A recommended exercise for the reader is to show that $M = 4$ works in VOA 7
for $u = v = \omega$. Note that in a VOA, any $Y(u, z)v$ will be a finite sum, i.e. the series
$Y(u, z)$ is summable (defined last section). It is a consequence of the axioms that $\mathbf{1} \in V_0$
and $\omega \in V_2$: for instance, VOA 3 only makes sense if $\omega_n \mathbf{1} = 0$ for all $n \ge 0$, so
$L_0 \mathbf{1} = \omega_1 \mathbf{1} = 0$ and hence $\mathbf{1} \in V_0$.

In RCFT, $V$ would be the 'Hilbert space of states' (more carefully, $V$ will be a dense
subspace of it), and $z = e^{t+ix}$ would be a local complex coordinate on a Riemann surface.
$L_0$ generates time translations, and so its eigenvalues (the conformal weights) can be
identified with energy. Physically, the requirement that $V_n \to 0$ for $n \to -\infty$ corresponds
to the energy of the RCFT being bounded from below. Also, $z = 0$ in VOA 3 corresponds
to the time limit $t \to -\infty$. For each state $u$, the vertex operator $Y(u, z)$ is a holomorphic
(chiral) quantum field. The vector $\mathbf{1}$ is the vacuum $|0\rangle$, and $Y(\omega, z)$ is the stress-energy
tensor $T$. The most important axiom, VOA 7, says that vertex operators commute up to a
possible pole at $z = w$, and so are local quantum fields. It is equivalent to the duality axiom
of many treatments of CFT. In the physics literature, there is a minor notational difference:
for $u \in V_k$, $Y(u, z) = \sum_n u_n z^{-n-1}$ is written $\sum_n u_{(n)} z^{-n-k}$. (Physicists prefer this because
it cleans up some formulas a little; mathematicians abhor it because it artificially prefers
the 'homogeneous' vectors $u \in V_k$.)

In Segal's language (see §1.1), $Y(u, z)$ appears quite naturally. Consider the physical
event of two strings combining to form a third. To first order (i.e. the tree-level Feynman
diagram), this would correspond in Segal's language to a 'pair-of-pants', or a sphere
with three punctures, two of which are positively oriented (corresponding to the incoming
strings) and the other being negatively oriented. We can think of the sphere as the Riemann
sphere $\mathbb{C} \cup \{\infty\}$; put the punctures at $\infty$ (outgoing) and $z$ and $0$ (the incomings).
Segal's functor $\mathcal{T}$ will associate to this a $z$-dependent homomorphism $\varphi_z : V \times V \to V$.
We write $\varphi_z(u, v) \in V$ as $Y(u, z)v$. Incidentally, the symbol 'Y' was chosen because of
this 'pair-of-pants' picture (time flows from the top of the 'Y' to the bottom), as was the
name 'vertex operator'.

The original axioms by Borcherds were a little more complicated and general: he
didn't require $\dim(V_n) < \infty$ nor the $V_n \to 0$ condition, and he only considered $L_0$ and
$L_{-1}$ rather than the full Virasoro algebra. The resulting generalisation is called a vertex
algebra.

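VOA 7 can be rewritten in the form (usually called the Jacobi identity for the VOA)

$$z_0^{-1}\,\delta\Bigl(\frac{z_1 - z_2}{z_0}\Bigr)\,Y(u, z_1)\,Y(v, z_2) - z_0^{-1}\,\delta\Bigl(\frac{z_2 - z_1}{-z_0}\Bigr)\,Y(v, z_2)\,Y(u, z_1) = z_2^{-1}\,\delta\Bigl(\frac{z_1 - z_0}{z_2}\Bigr)\,Y(Y(u, z_0)v, z_2)\ , \tag{2.6.1}$$

where the formal series are expanded in the appropriate way. This is the embodiment of
commutativity and associativity in the VOA, as we will see. To bring it into a more useful
form, hit it with $t \in V$ and expand out into $z_0^{\ell} z_1^{m} z_2^{n}$: we obtain

$$\sum_{i \ge 0} (-1)^i \binom{\ell}{i} \bigl(u_{\ell+m-i} \circ v_{n+i} - (-1)^{\ell}\, v_{\ell+n-i} \circ u_{m+i}\bigr) = \sum_{i \ge 0} \binom{m}{i}\,(u_{\ell+i} v)_{m+n-i}\ , \tag{2.6.2}$$

where for any $k \in \mathbb{Z}$, $j \in \mathbb{Z}_{\ge 0}$, $\binom{k}{j} := \frac{k(k-1)\cdots(k-j+1)}{j!}$. For instance, specialising (2.6.2) to
$\ell = 0$ and $m = 0$, resp., gives us

$$[u_m, v_n] = \sum_{i \ge 0} \binom{m}{i}\,(u_i v)_{m+n-i} \tag{2.6.3}$$

$$(u_\ell v)_n = \sum_{i \ge 0} (-1)^i \binom{\ell}{i} \bigl(u_{\ell-i} \circ v_{n+i} - (-1)^{\ell}\, v_{\ell+n-i} \circ u_i\bigr)\ . \tag{2.6.4}$$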
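The binomial coefficients in (2.6.2) to (2.6.4) are needed for negative upper argument as well, since the mode indices $\ell$ and $m$ range over all of $\mathbb{Z}$. The following is a minimal sketch of that generalised binomial, $k(k-1)\cdots(k-j+1)/j!$; the function name `binom` is an illustrative choice, not notation from the notes.

```python
from fractions import Fraction


def binom(k, j):
    """Generalised binomial k(k-1)...(k-j+1)/j! for any integer k and j >= 0."""
    if j < 0:
        raise ValueError("j must be nonnegative")
    value = Fraction(1)
    for step in range(j):
        value = value * (k - step) / (step + 1)
    return int(value)  # the result is always an integer for integer k


# For m >= 0 the sum over i in (2.6.3) stops at i = m on its own ...
print([binom(3, i) for i in range(6)])
# ... while for m < 0 no coefficient vanishes, e.g. binom(-1, i) = (-1)^i.
print([binom(-1, i) for i in range(6)])
```

For negative $m$ the sum in (2.6.3) is therefore finite only because of VOA 1, which makes $u_i v$ vanish once $i$ exceeds $N(u, v)$.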

Why is (2.6.1) called the Jacobi identity? Put $\ell = m = n = 0$ in (2.6.2): we get
$u_0(v_0 t) - v_0(u_0 t) = (u_0 v)_0 t$. If we now formally write $[xy] := x_0 y$, then this becomes
$[u[vt]] - [v[ut]] = [[uv]t]$, which is one of the forms of the Lie algebra Jacobi identity
(1.2.1b). Even though $[xy] \ne -[yx]$ here, this formal little trick will turn out to be quite
important next section.

The simplest examples of VOAs correspond to any even positive-definite lattice $\Lambda$;
for their construction see e.g. [27,39]. Physically, they correspond to a bosonic string
compactified on the torus $\mathbb{R}^n/\Lambda = S^1 \times \cdots \times S^1$ (where $n$ is the dimension of $\Lambda$); the central
charge $c = n$. Other important examples, first constructed by Frenkel-Zhu (again see e.g.
[27,39]), correspond to affine Kac-Moody algebras $X_\ell^{(1)}$ at level $k \in \mathbb{Z}_{\ge 0}$, and physically
to WZW theories on simply-connected compact group manifolds. (We discussed affine
algebras in §1.4.) These have central charge $c = \frac{k \dim(X_\ell)}{k + h^\vee}$, where $h^\vee$ is the dual Coxeter
number of $X_\ell$.
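To make the central charge of the affine examples concrete, here is a small sketch that evaluates the standard formula $c = k \dim(X_\ell)/(k + h^\vee)$, with $h^\vee$ the dual Coxeter number, for a few algebras. The table `ALGEBRA_DATA` and the function `central_charge` are illustrative names; the dimensions and dual Coxeter numbers listed are the standard ones.

```python
from fractions import Fraction

# (dimension, dual Coxeter number) for a few simple Lie algebras
ALGEBRA_DATA = {
    "A1": (3, 2),
    "A2": (8, 3),
    "G2": (14, 4),
    "E8": (248, 30),
}


def central_charge(name, level):
    """c = k dim / (k + h_dual) for the affine algebra at the given level k."""
    dim, dual_coxeter = ALGEBRA_DATA[name]
    return Fraction(level * dim, level + dual_coxeter)


for name in ALGEBRA_DATA:
    print(name, central_charge(name, 1))
```

At level 1 the simply-laced algebras in this table give an integer equal to the rank (1 for $A_1$, 2 for $A_2$, 8 for $E_8$), matching the lattice value $c = n$ quoted above, whereas $G_2$ gives the non-integer $14/5$.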
In 1984 Frenkel-Lepowsky-Meurman [23] constructed the Moonshine module $V^\natural$. It
is a VOA with $c = 24$, with $V^\natural = V_0^\natural \oplus V_1^\natural \oplus V_2^\natural \oplus \cdots$, where $V_0^\natural = \mathbb{C}\mathbf{1}$ is 1-dimensional,
$V_1^\natural = \{0\}$ is trivial, and $V_2^\natural = (\mathbb{C}\omega) \oplus (\text{Griess algebra})$ is $(1 + 196883)$-dimensional. Its
automorphism group (=symmetry group) is precisely the Monster $\mathbb{M}$. Each graded piece
$V_n^\natural$ is a finite-dimensional representation of $\mathbb{M}$; Borcherds proved that in fact $V^\natural$ is the
McKay-Thompson infinite-dimensional representation of $\mathbb{M}$. It can be regarded as the
most natural representation of $\mathbb{M}$: it is rather surprising that important aspects of a
finite group need to be studied via an infinite-dimensional representation.

$V^\natural$ has an elegant physical interpretation. First construct the bosonic string on
$\mathbb{R}^{24}/\Lambda_{24}$ (recall that $\Lambda_{24}$ is the Leech lattice). The resulting $c = 24$ VOA has partition
function (=graded dimension) $J(\tau) + 24$, but although its graded pieces (at least for
$n > 0$) have the right dimensions, they don't carry a natural representation of $\mathbb{M}$ and
so can't qualify for the McKay-Thompson representation. To get $V^\natural$, orbifold this $\Lambda_{24}$
VOA by the order-2 automorphism of $\Lambda_{24}$ sending $x \mapsto -x$. $V^\natural$ thus corresponds to a
holomorphic $c = 24$ RCFT, and Moonshine is related to physics. Most of Moonshine can
be interpreted physically, except perhaps the genus 0 property of the McKay-Thompson
series $T_g$.

There is a formal parallel between e.g. lattices and VOAs. For example, the Leech
lattice $\Lambda_{24}$ and the Moonshine module $V^\natural$ play analogous roles: $\Lambda_{24}$ is the unique even
lattice which (i) is self-dual, (ii) contains no norm 2 vectors, and (iii) has dimension 24; $V^\natural$
is believed to be the unique VOA which (i) possesses only one irreducible representation
(namely itself), (ii) contains no conformal weight 1 elements, and (iii) has central charge
$c = 24$. Analogies of these kinds are always useful as they suggest new directions to
explore, and the history of math blooms with them. The battlecry 'Why invent what can
be profitably copied' is not only heard in Hollywood.

We will end this section on a more speculative note. Witten (1986) said that to un- 
derstand string theory conceptually, we need a new analogue of Riemannian geometry. 
Huang (1997) has pushed this thought a little further, saying that there is a more classical
'particle-math' and a more modern 'string-math'. According to Huang we have the
real numbers (particle physics) versus the complex numbers (string theory); Lie algebras versus
VOAs; and the representation theory of Lie algebras versus RCFT, etc. What are the stringy
analogues of calculus, ordinary differential equations, Riemannian manifolds, the Atiyah-Singer
Index theorem,...? At present these are all unknown. However, Huang suggests
that just as we could imagine Moonshine as a mystery which is explained in some way 
by RCFT, perhaps the stringy version of calculus would similarly explain the mystery of 
2-dimensional gravity, stringy ODEs would explain the mystery of infinite-dimensional in- 
tegrable systems, stringy Riemannian manifolds would help explain the mystery of mirror 
symmetry, and the stringy index theorem would help explain the elliptic genus. 



2.7. Ingredient #4: Generalised Kac-Moody algebras 

In this section we investigate Lie algebras arising from VOAs. These Lie algebras are 
an interesting generalisation of Kac-Moody algebras. See e.g. [5,6,38 Chapter 11.13, 29].

Much of Lie theory (indeed much of algebra) is developed by analogy with simple
properties of integers. In §2.2 I invited you to think of a finite group as a massive generalisation
of the concept of whole number. Specifically, the number $n$ can be identified with
the cyclic group $\mathbb{Z}_n$ with $n$ elements. A divisor $d$ of $n$ generalises to a normal subgroup
of a group. A prime number then corresponds to a simple group. Multiplying numbers
corresponds to taking the semidirect product of groups (more generally, taking extensions
of groups). Then we find that every group has a unique set of simple building blocks
(although unlike numbers, different groups can have the same list of building blocks).

For a finite-dimensional Lie algebra, a divisor is called an ideal; a prime is called
simple; and multiplying corresponds to semidirect sum. Lie algebras behave more simply than
groups but not as simply as numbers, and the analogy sketched above is a reasonably
satisfactory one. In particular, simple Lie algebras are important for similar reasons that
simple groups are, and as mentioned in §1.2 can also be classified (with much less effort).
A good treatment of this important classification (over $\mathbb{C}$) is provided by [36]. The proof
is now reaching the state of perfection of the formulation of classical mechanics. One
unobvious discovery is that the best way to capture the structure of a simple Lie algebra is
by an integer matrix, called the Cartan matrix, or equivalently but more effectively (since
most entries in the Cartan matrix are 0's) by using a graph called the (Coxeter-)Dynkin
diagram. For instance the Dynkin diagram for $A_\ell$ consists of $\ell$ nodes connected sequentially
in a line. See Figure 6 in [59].

More precisely, define a symmetrised Cartan matrix to be a symmetric real matrix
$A = (a_{ij})_{i,j \le \ell}$ such that $a_{ij} \le 0$ if $i \ne j$, $a_{ii} > 0$, each $2a_{ij}/a_{ii} \in \mathbb{Z}$, and $A$ is positive-definite.
Examples of $2 \times 2$ symmetrised Cartan matrices are

$$\begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix}, \quad \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix}, \quad \begin{pmatrix} 1 & -1 \\ -1 & 2 \end{pmatrix}, \quad \begin{pmatrix} 2 & -3 \\ -3 & 6 \end{pmatrix}.$$

The Dynkin diagram corresponding to $A$ consists of $\ell$ nodes; the $i$th and $j$th nodes are
connected with $4a_{ij}^2/(a_{ii}a_{jj})$ lines, and if $a_{ii} \ne a_{jj}$, then we put an arrow over those lines
pointing to $i$ if $a_{ii} < a_{jj}$. The Dynkin diagrams corresponding to those four Cartan
matrices are respectively

○─○ , ○ ○ , ○⇐○ , ○⇚○

We may without loss of generality require $A$ to be indecomposable, or equivalently that the
Dynkin diagram be connected. Of the 4 given above, only the second is decomposable.
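The defining conditions and the edge rule can be checked mechanically on the four $2 \times 2$ examples (rows $(2,-1),(-1,2)$; $(2,0),(0,2)$; $(1,-1),(-1,2)$; $(2,-3),(-3,6)$). This is a minimal sketch, assuming integer matrix entries and using Sylvester's criterion (all leading principal minors positive) for positive-definiteness; the function names are illustrative.

```python
from fractions import Fraction


def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    size = len(m)
    det = Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, size):
            factor = m[r][col] / m[col][col]
            for c in range(col, size):
                m[r][c] -= factor * m[col][c]
    return det


def is_symmetrised_cartan(a):
    """Check the four conditions in the definition of a symmetrised Cartan matrix."""
    n = range(len(a))
    if any(a[i][j] != a[j][i] for i in n for j in n):
        return False
    if any(a[i][j] > 0 for i in n for j in n if i != j):
        return False
    if any(a[i][i] <= 0 for i in n):
        return False
    if any((Fraction(2 * a[i][j]) / a[i][i]).denominator != 1 for i in n for j in n):
        return False
    # Sylvester's criterion for positive-definiteness
    return all(determinant([row[:k] for row in a[:k]]) > 0
               for k in range(1, len(a) + 1))


def dynkin_lines(a, i, j):
    """Number of lines joining nodes i and j: 4 a_ij^2 / (a_ii a_jj)."""
    return Fraction(4 * a[i][j] ** 2) / (a[i][i] * a[j][j])


EXAMPLES = [
    [[2, -1], [-1, 2]],
    [[2, 0], [0, 2]],
    [[1, -1], [-1, 2]],
    [[2, -3], [-3, 6]],
]
for a in EXAMPLES:
    print(is_symmetrised_cartan(a), dynkin_lines(a, 0, 1))
```

The four examples give 1, 0, 2 and 3 lines respectively. A matrix such as rows $(2,-2),(-2,2)$ passes every test except positive-definiteness, which is exactly the condition dropped later when passing to Kac-Moody algebras.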

To any $\ell \times \ell$ symmetrisable Cartan matrix, we can construct the corresponding Lie
algebra $\mathfrak{g}$ in the following way. For each $i$, create 3 generators $e_i, f_i, h_i$ (so there are a total
of $3\ell$ generators). The relations these generators obey are given by the following brackets:
$[e_i f_j] = \delta_{ij} h_i$, $[h_i e_j] = a_{ij} e_j$, $[h_i f_j] = -a_{ij} f_j$, and for $i \ne j$, $\mathrm{ad}(e_i)^n e_j = \mathrm{ad}(f_i)^n f_j = 0$
where $n = 1 - 2a_{ij}/a_{ii}$. By '$\mathrm{ad}(e)$' here I mean the function $\mathfrak{g} \to \mathfrak{g}$ defined by $\mathrm{ad}(e)f = [ef]$.
So $\mathrm{ad}(e)^2 f = [e[ef]]$, $\mathrm{ad}(e)^3 f = [e[e[ef]]]$, etc.

(Footnote to the examples of symmetrised Cartan matrices: note that our Cartan matrices differ
from the usual definition, in which every diagonal entry equals 2.)

To get a better feeling for these relations, consider a fixed $i$. The generators
$e = \sqrt{2/a_{ii}}\,e_i$, $f = \sqrt{2/a_{ii}}\,f_i$, $h = (2/a_{ii})\,h_i$ obey the relations (1.2.2b). In other words, every node
in the Dynkin diagram corresponds to a copy of the $A_1$ Lie algebra. The lines connecting
these nodes tell how these $\ell$ copies of $A_1$ intertwine.

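For instance consider the first Cartan matrix given above. It corresponds to the Lie
algebra $A_2$, or $\mathfrak{sl}_3(\mathbb{C})$. The two $A_1$ subalgebras which generate it (corresponding to the 2
nodes of the Dynkin diagram) can be chosen to be the trace-zero matrices of the form

$$\begin{pmatrix} \star & \star & 0 \\ \star & \star & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad \begin{pmatrix} 0 & 0 & 0 \\ 0 & \star & \star \\ 0 & \star & \star \end{pmatrix}.$$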

It can be shown that the Lie algebra corresponding to an indecomposable symmetrised
Cartan matrix will be finite-dimensional and simple, and conversely that any
finite-dimensional simple Lie algebra corresponds to an indecomposable symmetrisable
Cartan matrix in this way.
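For the Cartan matrix with rows $(2,-1)$ and $(-1,2)$, the generators-and-relations description can be tested directly on $3 \times 3$ matrices. The sketch below assumes the usual choice of generators for $\mathfrak{sl}_3(\mathbb{C})$ as elementary matrices ($e_1 = E_{12}$, $e_2 = E_{23}$, $f_i$ their transposes); this choice and all function names are illustrative, not taken from the notes.

```python
def mat_mul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def bracket(x, y):
    """Matrix commutator [x, y] = xy - yx."""
    xy, yx = mat_mul(x, y), mat_mul(y, x)
    return [[xy[i][j] - yx[i][j] for j in range(3)] for i in range(3)]


def unit(i, j):
    """Elementary matrix with a single 1 in row i, column j (0-based)."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(3)] for r in range(3)]


def scale(s, x):
    return [[s * entry for entry in row] for row in x]


ZERO = [[0] * 3 for _ in range(3)]
A = [[2, -1], [-1, 2]]                 # Cartan matrix of A_2
e = [unit(0, 1), unit(1, 2)]           # assumed generators e_1, e_2
f = [unit(1, 0), unit(2, 1)]           # assumed generators f_1, f_2
h = [bracket(e[i], f[i]) for i in range(2)]


def relations_hold():
    for i in range(2):
        for j in range(2):
            if bracket(e[i], f[j]) != (h[i] if i == j else ZERO):
                return False
            if bracket(h[i], e[j]) != scale(A[i][j], e[j]):
                return False
            if bracket(h[i], f[j]) != scale(-A[i][j], f[j]):
                return False
            if i != j:
                # Here n = 1 - 2 a_ij / a_ii = 2, so ad(e_i)^2 e_j must vanish
                if bracket(e[i], bracket(e[i], e[j])) != ZERO:
                    return False
                if bracket(f[i], bracket(f[i], f[j])) != ZERO:
                    return False
    return True


print(relations_hold())
```

Note that $\mathrm{ad}(e_1)e_2 = E_{13}$ is nonzero: it is one of the extra basis vectors that make the dimension (8) larger than the number of generators (6).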

A confusion sometimes arises between the terms 'generators' and 'basis'. Both gener- 
ators and basis vectors build up the whole algebra; the difference lies in which operations 
you are permitted to use. For a basis, you are only allowed to use linear combinations 
(i.e. addition of vectors and multiplication by numbers), while for generators you are also 
permitted multiplication of vectors (or the bracket, in the Lie case). 'Dimension' refers to 
basis, while 'rank' usually refers in some way to generators. For instance the (commutative 
associative) algebra of polynomials in one variable x is infinite-dimensional — any basis 
needs infinitely many vectors. However, the single polynomial x is enough to generate it 
(so we could say that its rank is 1). Although those Lie algebras have $3\ell$ generators, their
dimensions in general will be greater.

From the point of view of generators and relations, the step from 'finite-dimensional
simple' to 'symmetrisable Kac-Moody' is rather easy: the only difference is that we drop
the 'positive-definite' condition (which was responsible for finite-dimensionality).
Kac-Moody (KM) algebras are also generated by (finitely many) $A_1$ subalgebras, and their
theory is quite parallel to that of the simple algebras. Compare Figures 6 and 9 in [59].

Now, it is easy to generalise something; the challenge is to generalise it in a rich and 
interesting direction. One natural and appealing strategy for generalisation was followed 
instinctively by a grad student named Robert Moody. Moody's original motivation for 
developing the theory of Kac-Moody algebras was the Weyl group. If there were Lie 
algebras for finite Coxeter groups, he asked, why not also for the Euclidean (=affine) 
ones? For another example of this style of generalisation, consider the question: What 
is the analogue of calculus (or manifolds) over weird fields, that is, fields (like $\mathbb{Z}_p$) for which
the usual limit definitions make no sense? This question leads to the riches of algebraic
geometry. Nevertheless this generalisation strategy, even in the hands of a master, will
not always be successful. For instance, consider all the trouble the following metaphor has
caused: my watch has a maker, so so too should the Universe.

Recently Borcherds produced a further generalisation of finite-dimensional simple Lie 
algebras, which is rather less obvious than that of Kac-Moody algebras. It is easy to 
associate a Lie algebra to a matrix $A$, but which class of matrices will yield a deep theory?
Borcherds found such a class by holding in his hand a single algebra (the fake Monster
Lie algebra, see §2.9) which acted a lot like a KM algebra, even though it had 'imaginary
simple roots'.

By a generalised symmetrised Cartan matrix $A = (a_{ij})$ we will mean a symmetric
real matrix (possibly infinite), such that $a_{ij} \le 0$ if $i \ne j$, and if $a_{ii} > 0$ then $2a_{ij}/a_{ii} \in \mathbb{Z}$
for all $j$. By a universal generalised Kac-Moody algebra (universal GKM) or universal
Borcherds-Kac-Moody algebra $\mathfrak{g}$ we mean the algebra with generators $e_i, f_i, h_{ij}$, and with
relations: $[e_i f_j] = h_{ij}$; $[h_{ij} e_k] = \delta_{ij} a_{ik} e_k$; $[h_{ij} f_k] = -\delta_{ij} a_{ik} f_k$; if $a_{ii} > 0$ and $i \ne j$ then
$\mathrm{ad}(e_i)^n e_j = \mathrm{ad}(f_i)^n f_j = 0$ for $n = 1 - 2a_{ij}/a_{ii}$; and if $a_{ij} = 0$ then $[e_i e_j] = [f_i f_j] = 0$.

For example the Heisenberg algebra (1.2.3) corresponds to the choice $A = (0)$, while
any other $1 \times 1$ $A = (a)$ corresponds to $A_1$. A universal GKM algebra differs from a
KM algebra in that it is built up from Heisenberg algebras as well as $A_1$, and these
subalgebras intertwine in more complicated ways. Nevertheless much of the theory for
finite-dimensional simple Lie algebras continues to find an analogue in this much more
general setting (e.g. root-space decomposition, Weyl group, character formula,...). This
unexpected fact is the point of GKM algebras.

To get a feel for these algebras, let us prove a few simple results concerning the $h_{ij}$.
Note first that, using the above relations together with the Jacobi identity, we obtain
$[h_{ij} h_{k\ell}] = \delta_{ij}(a_{jk} - a_{j\ell})\,h_{k\ell}$. Comparing this with $[h_{k\ell} h_{ij}] = -[h_{ij} h_{k\ell}]$, we see that bracket
must always equal 0. Hence all $h$'s pairwise commute, and $h_{ij} = 0$ unless the $i$th and $j$th
columns of $A$ are identical. An easy exercise now is to show that when $i \ne j$, $h_{ij}$ will lie
in the centre of the algebra (i.e. $h_{ij}$ will commute with all other generators).

Although the definition of universal GKM algebra is more natural, it turns out that
an equivalent form can be more useful in practice. (It's simpler to describe over $\mathbb{R}$, so
in most expositions the reals are used, but alas it's far too late for us to switch loyalties
now.) By a generalised Kac-Moody algebra (or Borcherds-Kac-Moody algebra) $\mathfrak{g}$, we mean
a (complex!) Lie algebra which is:

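- $\mathbb{Z}$-graded, i.e. $\mathfrak{g} = \oplus_i \mathfrak{g}_i$, where $[\mathfrak{g}_i \mathfrak{g}_j] \subseteq \mathfrak{g}_{i+j}$;

- $\mathfrak{g}_i$ is finite-dimensional for $i \ne 0$;

- $\mathfrak{g}$ has an antilinear involution $\omega$ (i.e. $\omega(kx + y) = k^*\omega(x) + \omega(y)$, $[\omega(x)\omega(y)] = \omega([xy])$,
and $\omega \circ \omega = \mathrm{id}$) which maps $\mathfrak{g}_i$ to $\mathfrak{g}_{-i}$ and acts as multiplication by $-1$ on some basis
of $\mathfrak{g}_0$;

- $\mathfrak{g}$ has an invariant symmetric bilinear form $(\cdot,\cdot)$ (i.e. $([xy], z) = (x, [yz])$ and $(y, z) =
(z, y) \in \mathbb{C}$), obeying $(\omega(x), \omega(y)) = (x, y)^*$, such that $(\mathfrak{g}_i, \mathfrak{g}_j) = 0$ if $i \ne -j$;

- the Hermitian form defined by $(x|y) := -(\omega(x), y)$ is positive-definite on $\mathfrak{g}_{i \ne 0}$.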

Note that for some basis $x_i$ of $\mathfrak{g}_0$, the third condition tells us $-[x_i x_j] = [(-x_i)(-x_j)] =
[x_i x_j]$, i.e. $\mathfrak{g}_0$ has a trivial bracket. It plays the role of the Cartan subalgebra $\mathfrak{h}$ in the theory.

For example, let $\mathfrak{g} = \mathfrak{sl}_2(\mathbb{C})$ and recall (1.2.5). Then $\mathfrak{g}_1 = \mathbb{C}e$, $\mathfrak{g}_0 = \mathbb{C}h$, $\mathfrak{g}_{-1} = \mathbb{C}f$ is
the root-space decomposition. $\omega(x) = -x^\dagger$ is the Cartan involution, $(x, y) = \mathrm{tr}(xy)$ is the
Killing form.

(Footnote to the definition of a universal GKM algebra: as with KM algebras, usually we want
to extend it by some derivations; enough derivations are added so that the simple roots are
linearly independent.)

It turns out [5] that any universal GKM algebra is a GKM algebra, and any GKM
algebra can be constructed from a unique universal GKM algebra (by quotienting out part
of the centre and adding derivations), so in that sense the two structures are equivalent.
This theorem is important, because it tells us that GKM algebras are the ultimate generalisation
of simple Lie algebras, in the sense that any further generalisation will lose some
basic structural ingredient.

We know simple Lie algebras (and groups) arise in both classical and quantum physics, 
and the affine KM algebras are important in CFT, as we saw in Part 1. GKM algebras 
have recently appeared in the physics literature (see Harvey-Moore) in the context of BPS 
states in string theory. 

How do GKM algebras arise in VOAs? If we define $[xy] := x_0 y$, then as mentioned
in §2.6 we get from the VOA Jacobi identity the equation $[x[yz]] - [y[xz]] = [[xy]z]$. Thus
our bracket will be anti-associative if it is anti-commutative. But is it anti-commutative?
It can be shown

$$u_n v = \sum_{i \ge 0} (-1)^{i+n+1}\,\frac{1}{i!}\,(L_{-1})^i\,(v_{n+i}\,u)\ , \tag{2.7.1}$$

so $u_0 v = -v_0 u$ if we look at things mod $L_{-1}V$.

Since our bracket is clearly bilinear, we thus get a Lie algebra structure on $V/L_{-1}V$.
Similarly, we get a symmetric bilinear product on $V/L_{-1}V$, given by $(u, v) := u_1 v$.

We would like $(\cdot,\cdot)$ to respect the Lie algebra structure, i.e. be $[\cdot\,\cdot]$-invariant. We
compute from (2.6.4)

$$([uv], t) = -[v(u,t)] + (u,[vt])\ . \tag{2.7.2}$$

Since we would like to identify $(\cdot,\cdot)$ with the bilinear form in the GKM algebra definition,
we also would like it to be number-valued (i.e. have 1-dimensional range).

There is a simple way to satisfy both of these. First, restrict attention to $V_1$, i.e. the
conformal weight 1 vectors: $V_1 \cap (V/L_{-1}V) = V_1/L_{-1}V_0$. Then $(u, v) \in V_0$. Assume $V_0$ is
1-dimensional: i.e. $V_0 = \mathbb{C}\mathbf{1}$. Then $(u, v)$ will equal a number times $\mathbf{1}$, so call $\langle u, v\rangle$ that
number. Also, $[(u,t)v] = \langle u,t\rangle\,\mathbf{1}_0 v = 0$, so $(\star,\star)$, and hence $\langle\star,\star\rangle$, will be invariant. Of
course, when $V_0 = \mathbb{C}\mathbf{1}$, $L_{-1}V_0 = \{0\}$.

$V_1/L_{-1}V_0$ is generally too large in practice to be useful; a subalgebra can be defined
as follows. Let $P_n$ be the 'primary states with conformal weight $n$', i.e. the $u \in V_n$ killed
by $L_m$ for all $m > 0$. Then $\mathfrak{g}(V) := P_1/L_{-1}P_0$ will be a subalgebra of $V_1/L_{-1}V_0$. Through
the assignment $u \mapsto u_0$, $\mathfrak{g}(V)$ acts on $V$ and this action commutes with that of $L_i$. This
association of a Lie algebra to a VOA is due to Borcherds (1986).

Similar arguments show that when $V_0$ is 1-dimensional and $V_1$ is 0-dimensional, then
$V_2$ will necessarily be a commutative nonassociative algebra with product $u \times v := u_1 v \in V_2$
and identity element $\frac{1}{2}\omega$ (proof: $\omega \times u = L_0 u = 2u$). Now, those conditions on $V_0, V_1$ are
satisfied by the Moonshine module $V^\natural$. We find that $V_2^\natural$ is none other than the Griess
algebra extended by an identity element.



2.8. Ingredient #5: Denominator identities

In §1.3, we discussed the representation theory of Lie algebras. An important invariant
of a representation is its character. Simple Lie algebras possess a very useful formula for
their characters, due to Weyl:

$$\mathrm{ch}_\lambda(z) := \sum_{\mu} \dim(L_\lambda(\mu))\,e^{\mu\cdot z} = e^{-\rho\cdot z}\,\frac{\sum_{w \in W} \det(w)\,e^{w(\lambda+\rho)\cdot z}}{\prod_{\alpha \in \Delta_+} (1 - e^{-\alpha\cdot z})}\ , \tag{2.8.1}$$

where $W$ is the Weyl group, $\Delta_+$ the positive roots, and where $\oplus_\mu L_\lambda(\mu)$ is the weight-space
decomposition of $L_\lambda$, i.e. the simultaneous eigenspaces of the $h_i$. Here $z$ belongs to the
Cartan subalgebra $\mathfrak{h}$; the character is complex-valued. Analogous statements hold for all
GKM algebras.

It is rare indeed when a trivial special case of a theorem or formula produces something
interesting. But that is what happens here. Consider the trivial representation: i.e. $x \mapsto 0$
for all $x \in X_\ell$. Then the character is identically 1, by definition: $\mathrm{ch}_0 = 1$. Thus the
character formula tells us that a certain alternating sum over a Weyl group, equals a
certain product over positive roots. These formulas, called denominator formulas, are
nontrivial even in the finite-dimensional cases.

Consider for instance the smallest simple algebra, $A_1$. Here the identity indeed is too
trivial: it reads $e^{z/2} - e^{-z/2} = e^{z/2}(1 - e^{-z})$. For $A_2$ we get a sum of 6 terms equalling a
product of 3 terms, and the complexity continues to rise from there.

Around 1970 Macdonald tried to generalise these finite denominator identities to infinite
identities, corresponding to the extended Dynkin diagrams. These were later reinterpreted
by Kac, Moody and others as denominator identities for affine nontwisted KM
algebras. The simplest one was known classically as the Jacobi triple product identity:

$$\sum_{n=-\infty}^{\infty} (-1)^n x^{n^2} y^n = \prod_{m=1}^{\infty} (1 - x^{2m})(1 - x^{2m-1}y)(1 - x^{2m-1}y^{-1})\ .$$

We now know it to be the denominator identity for the simplest infinite-dimensional KM
algebra, $A_1^{(1)}$.
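The Jacobi triple product identity, in the form $\sum_{n} (-1)^n x^{n^2} y^n = \prod_{m \ge 1} (1 - x^{2m})(1 - x^{2m-1}y)(1 - x^{2m-1}y^{-1})$, can be checked to any finite order in $x$. The sketch below is illustrative only: `ORDER` and `truncated_mul` are hypothetical names, and a monomial $x^i y^j$ is stored under the key `(i, j)`. Truncating at a fixed $x$-degree is legitimate because every factor has nonnegative $x$-degree.

```python
ORDER = 30  # compare all terms of x-degree <= ORDER


def truncated_mul(p, q):
    """Multiply series in x (Laurent in y), dropping x-degrees above ORDER."""
    out = {}
    for (i, j), a in p.items():
        for (k, l), b in q.items():
            if i + k <= ORDER:
                key = (i + k, j + l)
                out[key] = out.get(key, 0) + a * b
    return {key: v for key, v in out.items() if v != 0}


# Product side: factors with 2m - 1 > ORDER only contribute beyond the cutoff.
product_side = {(0, 0): 1}
m = 1
while 2 * m - 1 <= ORDER:
    for factor in (
        {(0, 0): 1, (2 * m, 0): -1},        # 1 - x^(2m)
        {(0, 0): 1, (2 * m - 1, 1): -1},    # 1 - x^(2m-1) y
        {(0, 0): 1, (2 * m - 1, -1): -1},   # 1 - x^(2m-1) / y
    ):
        product_side = truncated_mul(product_side, factor)
    m += 1

# Sum side: (-1)^n x^(n^2) y^n for all integers n with n^2 <= ORDER
sum_side = {}
n = 0
while n * n <= ORDER:
    for signed_n in {n, -n}:
        sum_side[(n * n, signed_n)] = (-1) ** n
    n += 1

print(product_side == sum_side)
```

Almost everything in the expanded product cancels: up to $x$-degree 30 only 11 monomials survive, one for each $n$ with $|n| \le 5$.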

Freeman Dyson is famous for his work in quantum field theory, but he started as an
undergraduate in number theory and still enjoys it as a hobby. Dyson [20] found a curious
formula for the Ramanujan $\tau$-function, which can be defined by the generating function
$\sum_{n=1}^{\infty} \tau(n)\,x^n = \eta^{24}(x) := x \prod_{m=1}^{\infty} (1 - x^m)^{24}$. Dyson found the remarkable formula

$$\tau(n) = \sum \frac{(a-b)(a-c)(a-d)(a-e)(b-c)(b-d)(b-e)(c-d)(c-e)(d-e)}{1!\,2!\,3!\,4!}$$

where the sum is over all 5-tuples $(a, b, c, d, e) \equiv (1, 2, 3, 4, 5)$ (mod 5) obeying $a + b + c +
d + e = 0$ and $a^2 + b^2 + c^2 + d^2 + e^2 = 10n$. Using this, an analogous formula can be found
for $\eta^{24}$. Dyson knew that similar-looking formulas were also known for $\eta^d$ for the values
$d = 3, 8, 10, 14, 15, 21, 24, 26, 28, 35, 36, \ldots$.
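Dyson's formula lends itself to a direct numerical check against the product definition of $\tau(n)$. The sketch below is illustrative and not from the notes: `tau_from_product` expands $x \prod_m (1 - x^m)^{24}$ up to a given order, and `tau_dyson` sums the product of the ten differences over all 5-tuples congruent to $(1,2,3,4,5)$ mod 5 with $a+b+c+d+e = 0$ and $a^2 + \cdots + e^2 = 10n$, then divides by $1!\,2!\,3!\,4! = 288$.

```python
from math import isqrt  # requires Python 3.8 or later


def tau_from_product(limit):
    """Coefficients of x * prod_{m>=1} (1 - x^m)^24 up to x^limit (index = power)."""
    series = [0] * (limit + 1)
    series[1] = 1                          # the leading factor x
    for m in range(1, limit + 1):
        for _ in range(24):                # multiply by (1 - x^m), 24 times
            for degree in range(limit, m - 1, -1):
                series[degree] -= series[degree - m]
    return series


def tau_dyson(n):
    """tau(n) from Dyson's sum over 5-tuples (a, b, c, d, e)."""
    bound = isqrt(10 * n)                  # each |entry| is at most sqrt(10 n)

    def residues(r):
        return [v for v in range(-bound, bound + 1) if v % 5 == r]

    total = 0
    for a in residues(1):
        for b in residues(2):
            for c in residues(3):
                for d in residues(4):
                    e = -(a + b + c + d)   # forces the sum to vanish; e = 0 mod 5
                    if a * a + b * b + c * c + d * d + e * e != 10 * n:
                        continue
                    values = (a, b, c, d, e)
                    term = 1
                    for i in range(5):
                        for j in range(i + 1, 5):
                            term *= values[i] - values[j]
                    total += term
    quotient, remainder = divmod(total, 288)   # 288 = 1! 2! 3! 4!
    if remainder:
        raise ArithmeticError("sum is not divisible by 288")
    return quotient


print([tau_dyson(n) for n in range(1, 6)])
print(tau_from_product(5)[1:])
```

For $n = 1$ the only contributing tuple is $(1, 2, -2, -1, 0)$, whose ten differences multiply to exactly 288, giving $\tau(1) = 1$.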
What was ironic was that Dyson found that formula at the same time that Macdonald
was finding the Macdonald identities. Both were at Princeton then, and would often chat
a little when they bumped into each other after dropping off their daughters at school.
But they never discussed work. Dyson didn't realise that his strange list of numbers
has a simple interpretation: they are precisely the dimensions of the simple Lie algebras!
$3 = \dim(A_1)$, $8 = \dim(A_2)$, $10 = \dim(C_2)$, $14 = \dim(G_2)$, etc. In fact these formulas for
$\eta^d$ are none other than (specialisations of) the Macdonald identities. For example, Dyson's
formula is the denominator formula for $A_4$ ($24 = \dim(A_4)$). If they had spoken, they
would probably have anticipated the affine denominator identity interpretation.

One curiosity apparently still has no algebraic interpretation: No simple Lie algebra
has dimension 26, so the formula for $\eta^{26}$ can't correspond to any Macdonald identity.

Macdonald didn't close the book on denominator identities. More recently Kac and
Wakimoto [40] have used denominator identities for Lie superalgebras to obtain nice formulas
for various generating functions involving sums of squares, sums of triangular numbers
(triangular numbers are numbers of the form $\frac{1}{2}k(k + 1)$), etc. For instance, the number of
ways $n$ can be written as a sum of 16 triangular numbers is

$$\frac{1}{3 \cdot 4^3} \sum ab\,(a^2 - b^2)^2$$

where the sum is over all odd positive integers $a, b, r, s$ obeying $ar + bs = 2n + 4$ and $a > b$.
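The Kac-Wakimoto count of representations by 16 triangular numbers, $\frac{1}{3\cdot 4^3}\sum ab\,(a^2-b^2)^2$ over odd positive $a, b, r, s$ with $ar + bs = 2n + 4$ and $a > b$, can be compared with a brute-force expansion. This sketch assumes that 'number of ways' means ordered 16-tuples of triangular numbers with 0 allowed, i.e. the coefficient of $x^n$ in $\bigl(\sum_{k \ge 0} x^{k(k+1)/2}\bigr)^{16}$; both function names are illustrative.

```python
def count_by_brute_force(limit, parts=16):
    """Coefficients of (sum_k x^(k(k+1)/2))^parts up to x^limit."""
    triangular = []
    k = 0
    while k * (k + 1) // 2 <= limit:
        triangular.append(k * (k + 1) // 2)
        k += 1
    counts = [1] + [0] * limit             # the empty sum
    for _ in range(parts):
        new = [0] * (limit + 1)
        for total, ways in enumerate(counts):
            if ways:
                for t in triangular:
                    if total + t <= limit:
                        new[total + t] += ways
        counts = new
    return counts


def count_by_formula(n):
    """Sum of a*b*(a^2 - b^2)^2 over the allowed (a, b, r, s), divided by 3*4^3."""
    target = 2 * n + 4
    total = 0
    for a in range(1, target, 2):                       # a odd
        for r in range(1, target // a + 1, 2):          # r odd
            rest = target - a * r                       # this must equal b*s
            if rest <= 0:
                continue
            for b in range(1, a, 2):                    # b odd, b < a
                if rest % b == 0 and (rest // b) % 2 == 1:   # s odd
                    total += a * b * (a * a - b * b) ** 2
    return total // 192                                 # 192 = 3 * 4^3


print([count_by_formula(n) for n in range(8)])
print(count_by_brute_force(7))
```

As a sanity check, $n = 1$ has exactly 16 representations (a single 1 in any of the 16 slots), and the formula gives $(192 + 2880)/192 = 16$.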
Another example of denominator identities is Borcherds' use of them in proving the 
Moonshine conjectures. In particular this motivated his introduction of the GKM algebras. 
The denominator identities for other GKM algebras were used by Borcherds to obtain 
results on the moduli spaces of e.g. families of K3 surfaces. They are also often turned
around now and used for learning about the positive roots in a given GKM.



2.9. Proof of the Moonshine conjectures 

The main Conway-Norton conjecture was proved almost immediately. Thompson
showed that if $g \mapsto a_n(g)$ is a character for all $n < 1200$, then it will be for all $n$. He
also showed that if certain congruence conditions hold for a certain number of $a_n(g)$ (all
with $n < 100$), then all $g \mapsto a_n(g)$ will be virtual characters (i.e. a linear combination over
$\mathbb{Z}$ of irreducible characters of $\mathbb{M}$; only if all coefficients are nonnegative will it be a true
character). Atkin-Fong-Smith [54] used that to prove on a computer that indeed all were
virtual characters. But their work didn't say anything about the underlying (possibly
virtual) representation $V$. The real challenge was to construct (preferably in a natural
manner) the representation which works. Frenkel-Lepowsky-Meurman [23] constructed
a candidate for it (the Moonshine module $V^\natural$); it was Borcherds who finally proved $V^\natural$
obeyed the Conway-Norton conjecture. A good overview of Borcherds' work on Moonshine
is provided in [32].

We want to show that the McKay-Thompson series $T_g(\tau) := q^{-1}\,\mathrm{tr}_{V^\natural}(g\,q^{L_0})$ of (2.1.4)
equals the Hauptmodul $J_g(\tau)$ in (2.4.1) (the 'fudge factor' $q^{-1} = q^{-24/24}$ is familiar to
e.g. KM algebras and CFT and was discussed at the end of §1.2). Borcherds' strategy
was to bring in Lie theory and to use the corresponding denominator identity to provide
useful combinatorial data. The first guess for this 'Monster Lie algebra' was the Kac-Moody
algebra whose Dynkin diagram is essentially the Leech lattice (i.e. a node for each
vector in $\Lambda_{24}$, and 2 nodes are connected by a number of edges depending on the value of a
certain dot product). It was eventually discarded because some of the critical data (namely
positive root multiplicities) needed in order to write down its denominator identity were
too complicated. But looking at that failed attempt led Borcherds to a second candidate,
now called the fake Monster Lie algebra $\mathfrak{g}_M'$. In order to construct it, he developed the
theory of VOAs; and in order to understand it, he developed the theory of GKM algebras.
We will define it shortly. $\mathfrak{g}_M'$ also turned out to be inadequate for proving the Moonshine
conjectures; however it directly led him to the GKM algebra now called the true Monster
Lie algebra $\mathfrak{g}_M$. And that directly led to the proof of Moonshine.

Step 1: Construct $\mathfrak{g}_M$ from $V^\natural = V_0 \oplus V_1 \oplus \cdots$. For later convenience, reparametrise these
subspaces $V^i := V_{i+1}$. Recall the even indefinite lattice $II_{1,1}$ defined in (1.6.1). Of course
the direct choice $\mathfrak{g}(V^\natural)$ is 0-dimensional because $V_1$ is trivial, so we must modify $V^\natural$ first.

The Monster Lie algebra is (essentially) defined to be the Lie algebra $\mathfrak{g}(V^\natural \otimes V_{II_{1,1}})$
associated to the vertex algebra $V^\natural \otimes V_{II_{1,1}}$ (strictly speaking, more of $\mathfrak{g}$ is quotiented
away). By contrast, the fake Monster is the Lie algebra associated to the vertex algebra
$V_{\Lambda_{24}} \otimes V_{II_{1,1}} = V_{II_{25,1}}$. Both of these are vertex algebras as opposed to VOAs, because of
the presence of the indefinite lattices, but this isn't important here. $\mathfrak{g}_M$ inherits a $II_{1,1}$-grading
from $V_{II_{1,1}}$: the piece of grading $(m, n)$ is isomorphic (as a vector space) to $V^{mn}$, if
$(m, n) \neq (0, 0)$; the $(0,0)$ piece is isomorphic to $\mathbb{R}^2$. Borcherds uses the No-Ghost Theorem
of string theory to show that the homogeneous pieces of $\mathfrak{g}_M$ are those of $V^\natural$.

Both $\mathfrak{g}_M$ and $\mathfrak{g}_M'$ are GKM algebras; for instance the $\mathbb{Z}$-grading of $\mathfrak{g}_M$ is given by
$(\mathfrak{g}_M)_k = \oplus_{m+n=k} V^{mn}$ for $k \neq 0$, while the 0-part is $V^{-1} \otimes V^{-1}$. Although $\mathfrak{g}_M'$ is not
used in the proof of the Monstrous Moonshine conjectures, it is related to some kind of
Moonshine for the finite simple group .1, which is 'half' of the automorphism group .0 of
the Leech lattice $\Lambda_{24}$. $\mathfrak{g}_M$ corresponds to the Cartan matrix

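$$\begin{pmatrix}
B_{-1,-1} & B_{-1,1} & B_{-1,2} & \cdots \\
B_{1,-1} & B_{1,1} & B_{1,2} & \cdots \\
B_{2,-1} & B_{2,1} & B_{2,2} & \cdots \\
\vdots & \vdots & \vdots & \ddots
\end{pmatrix}$$

where $B_{ij}$ for $i, j \in \{-1, 1, 2, 3, \ldots\}$ is the $a_i \times a_j$ block with fixed entry $-i - j$ (the $a_i$ as
usual are the coefficients $\sum_n a_n q^n$ of $j - 744$).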

Step 2: Compute the denominator identity of $\mathfrak{g}_M$: we get

$$p^{-1}\prod_{m>0,\ n\in\mathbb{Z}} (1 - p^m q^n)^{a_{mn}} = J(z) - J(\tau) \qquad (2.9.1)$$

where $p = e^{2\pi i z}$. The result is various formulas involving the coefficients $a_i$, for instance
$a_4 = a_3 + (a_1^2 - a_1)/2$. It turns out to be possible to 'twist' (2.9.1) by each $g \in \mathbb{M}$, obtaining

$$p^{-1}\exp\Big[-\sum_{k>0}\ \sum_{m>0,\ n\in\mathbb{Z}} a_{mn}(g^k)\,\frac{p^{mk}q^{nk}}{k}\Big] = T_g(z) - T_g(\tau). \qquad (2.9.2)$$

This looks a lot more complicated, but you can glimpse the Taylor expansion of $\ln(1 - p^m q^n)$
there and in fact for $g = \mathrm{id}$ (2.9.2) reduces to (2.9.1). This formula gives more generally
identities like $a_4(g) = a_3(g) + (a_1(g)^2 - a_1(g^2))/2$, where $T_g(\tau) = \sum_i a_i(g) q^i$. These formulas
involving the McKay-Thompson coefficients are equivalent to the replication formulae
conjectured in §2.4.
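
For $g = \mathrm{id}$ the identity for $a_4$ can be checked directly from the $q$-expansion of $j$. The sketch below is an illustration only: it assumes the standard formula $j = E_4^3/\Delta$ with $E_4 = 1 + 240\sum_{n\ge 1}\sigma_3(n) q^n$ and $\Delta = q\prod_{n\ge 1}(1-q^n)^{24}$, which is not spelled out in this section, and the function name `j_coefficients` is made up here.

```python
def j_coefficients(n_terms):
    """Return a list c with c[k] = coefficient of q^(k-1) in j(tau),
    computed as E4^3 / prod_{n>=1} (1 - q^n)^24 using exact integers."""
    def sigma3(n):
        return sum(d ** 3 for d in range(1, n + 1) if n % d == 0)

    def mul(a, b):
        out = [0] * n_terms
        for i, x in enumerate(a):
            for k in range(n_terms - i):
                out[i + k] += x * b[k]
        return out

    e4 = [1] + [240 * sigma3(n) for n in range(1, n_terms)]
    series = mul(mul(e4, e4), e4)
    # Divide by (1 - q^n) twenty-four times for each n: s[i] += s[i - n] in place.
    for n in range(1, n_terms):
        for _ in range(24):
            for i in range(n, n_terms):
                series[i] += series[i - n]
    return series


c = j_coefficients(8)
a = {k - 1: c[k] for k in range(len(c))}   # a[n] = coefficient of q^n; a[0] is 744

lhs = a[4]
rhs = a[3] + (a[1] ** 2 - a[1]) // 2
print(a[1], a[2], a[3], a[4])
print("a4 == a3 + (a1^2 - a1)/2:", lhs == rhs)
```

Only the coefficients $a_1$, $a_3$ and $a_4$ enter, so the constant term 744 of $j$ plays no role in the check.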

Step 3: It was known earlier that all of the Hauptmoduls also obey those replication
formulae, and that anything obeying them will lie in a finite-dimensional manifold which
we'll call $R$. In particular, if $B(q) = q^{-1} + \sum_{n>0} b_n q^n$ and $C(q) = q^{-1} + \sum_{n>0} c_n q^n$ both
lie in $R$, and $b_n = c_n$ for $n \le 23$, then $B(q) = C(q)$. In fact, it turns out that if we verify
for each conjugacy class $[g]$ of $\mathbb{M}$ that the first, second, third, fourth and sixth coefficients
of the McKay-Thompson series $T_g$ and the corresponding Hauptmodul $J_g$ agree, then
$T_g = J_g$, and we are done.

That is precisely what Borcherds then did: he compared finitely many coefficients, and 
as they all equalled what they should, this concluded the proof of Monstrous Moonshine! 

However there was a disappointing side to his proof. While no one disputed its logical
validity, it did seem to possess a conceptual gap. In particular, the Moonshine
conjectures were made in the hope that proving them would help explain what the Monster
had to do with the $j$-function and the other Hauptmoduls. A good proof says much more
than 'True' or 'False'. The case-by-case verification occurred at the critical point where the
McKay-Thompson series were being compared directly to the Hauptmoduls. The proof
showed that indeed the Moonshine module establishes some sort of relation between $T_g$
and $J_g$ (namely, they must lie in the same finite-dimensional space), but why couldn't it
be just a happy meaningless accident that they be equal? Of course we believe it's more
than merely an accident, so our proof should reflect this: we want a more conceptual
explanation.

This conceptual gap has since been filled [17]: the case-by-case verification has
been replaced with a general theorem. It turns out that something obeying the replication
formulae will also obey something called modular equations. A modular equation for a
function $f$ is a polynomial identity obeyed by $f(x)$ and $f(nx)$. The simplest examples come
from the exponential and cosine functions: note that for any $n > 0$, $\exp(nx) = (\exp(x))^n$
and $\cos(nx) = T_n(\cos(x))$ where $T_n$ is a Tchebychev polynomial. A more interesting
example of a modular equation is obeyed by $J(\tau) = j(\tau) - 744$: put $X = J(\tau)$ and
$Y = J(2\tau)$, then

$$(X^2 - Y)(Y^2 - X) = 393768\,(X^2 + Y^2) + 42987520\,XY + 40491318744\,(X + Y) - 121136760788544.$$
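
A modular equation of this kind can be tested on truncated $q$-expansions. The sketch below is an illustration under stated assumptions, not a proof: it computes $j$ from the standard formula $j = E_4^3/\Delta$ (not given in this section), forms $X = J(\tau)$ and $Y = J(2\tau)$ as Laurent series stored in dictionaries, solves for the constant term from the $q^0$ coefficient, and then checks that every other coefficient up to the truncation order agrees. All function and variable names are invented for this example.

```python
def j_times_q(n_terms):
    """Coefficients of q*j(tau) = E4^3 / prod_{n>=1} (1 - q^n)^24 (exact integers)."""
    def sigma3(n):
        return sum(d ** 3 for d in range(1, n + 1) if n % d == 0)

    def mul(a, b):
        out = [0] * n_terms
        for i, x in enumerate(a):
            for k in range(n_terms - i):
                out[i + k] += x * b[k]
        return out

    e4 = [1] + [240 * sigma3(n) for n in range(1, n_terms)]
    s = mul(mul(e4, e4), e4)
    for n in range(1, n_terms):
        for _ in range(24):
            for i in range(n, n_terms):   # in-place division by (1 - q^n)
                s[i] += s[i - n]
    return s


def series_mul(a, b, cutoff):
    """Product of two Laurent series {exponent: coeff}, dropping exponents > cutoff."""
    out = {}
    for i, x in a.items():
        for k, y in b.items():
            if i + k <= cutoff:
                out[i + k] = out.get(i + k, 0) + x * y
    return out


def combine(*terms):
    """Linear combination of series; each term is a pair (scalar, series)."""
    out = {}
    for scalar, s in terms:
        for e, x in s.items():
            out[e] = out.get(e, 0) + scalar * x
    return out


N_TERMS = 20
P = N_TERMS - 2                                   # X is known up to q^P
coeffs = j_times_q(N_TERMS)
X = {k - 1: c for k, c in enumerate(coeffs) if k != 1}   # drop the constant 744
Y = {2 * e: c for e, c in X.items() if 2 * e <= P}       # J(2 tau)

XX, YY, XY = series_mul(X, X, P), series_mul(Y, Y, P), series_mul(X, Y, P)
lhs = series_mul(combine((1, XX), (-1, Y)), combine((1, YY), (-1, X)), P)
rest = combine((393768, XX), (393768, YY), (42987520, XY),
               (40491318744, X), (40491318744, Y))
diff = combine((1, lhs), (-1, rest))

constant_term = diff.get(0, 0)                    # what the constant must be
# Truncation errors only affect exponents above P - 5, so compare below that.
identity_holds = all(v == 0 for e, v in diff.items() if e != 0 and e <= P - 5)
print("constant term:", constant_term)
print("all other coefficients vanish:", identity_holds)
```

The two leading coefficients can also be read off by hand: $X^2 - Y = 2a_1 + 2a_2 q + \cdots$ and $Y^2 - X = q^{-4} + \cdots$, so the left side starts with $2a_1 q^{-4} + 2a_2 q^{-3}$, matching $393768 = 2\cdot 196884$ and $42987520 = 2\cdot 21493760$.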

Finding modular equations (for various elliptic functions) was a passion of the great
mathematician Ramanujan. His notebooks are filled with them. See e.g. [7] for an application
of Ramanujan's modular equations to computing the first billion or so digits of
$\pi$. Many modular equations are also studied in [10]. For more of their applications, see
e.g. [16]. It can be shown that the only functions $f(\tau) = q^{-1} + a_1 q + \cdots$ which obey
modular equations for all $n$ are $J(\tau)$ and the 'modular fictions' $q^{-1}$ and $q^{-1} \pm q$ (which
are essentially exp, cos, and sin).
It was proved in [17] that, roughly speaking, a function $B(\tau) = q^{-1} + \sum_{n>0} b_n q^n$
which obeys enough modular equations will either be of the form $B(\tau) = q^{-1} + b_1 q$,
or will necessarily be a Hauptmodul for a modular group containing some $\Gamma_0(N)$. The
converse is also true: for instance, a modular equation for the Hauptmodul $J_{25}$ of $\Gamma_0(25)$
given in (2.3.5) is

$$(X^2 - Y)(Y^2 - X) = -2\,(X^2 + Y^2) + 4\,(X + Y) - 4,$$

where $X = J_{25}(\tau)$ and $Y = J_{25}(2\tau)$. To eliminate the conceptual gap, this result should
then replace step 3. Steps 1 and 2 are still required, however.
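
The modular equation for $J_{25}$ can be tested the same way on truncated series. This sketch assumes that the Hauptmodul of (2.3.5) is the eta quotient $J_{25}(\tau) = \eta(\tau)/\eta(25\tau) + 1 = q^{-1} - q + q^4 + q^6 - \cdots$; that formula is not restated in this section, so treat it as an assumption of the example. The helper names are invented for this illustration.

```python
def eta_quotient_25(n_terms):
    """Coefficients of prod_{n>=1} (1 - q^n) / (1 - q^(25 n)), truncated to n_terms."""
    s = [1] + [0] * (n_terms - 1)
    for n in range(1, n_terms):
        for i in range(n_terms - 1, n - 1, -1):   # multiply by (1 - q^n)
            s[i] -= s[i - n]
    for n in range(25, n_terms, 25):
        for i in range(n, n_terms):               # divide by (1 - q^n)
            s[i] += s[i - n]
    return s


def series_mul(a, b, cutoff):
    """Product of two Laurent series {exponent: coeff}, dropping exponents > cutoff."""
    out = {}
    for i, x in a.items():
        for k, y in b.items():
            if i + k <= cutoff:
                out[i + k] = out.get(i + k, 0) + x * y
    return out


def combine(*terms):
    """Linear combination of series; each term is a pair (scalar, series)."""
    out = {}
    for scalar, s in terms:
        for e, x in s.items():
            out[e] = out.get(e, 0) + scalar * x
    return out


N_TERMS = 40
P = N_TERMS - 2
quotient = eta_quotient_25(N_TERMS)
# eta(tau)/eta(25 tau) = q^(-1) * quotient; adding 1 removes the constant term -1.
X = {k - 1: c for k, c in enumerate(quotient)}
X[0] = X.get(0, 0) + 1
X = {e: c for e, c in X.items() if c != 0}
Y = {2 * e: c for e, c in X.items() if 2 * e <= P}       # J25(2 tau)

XX, YY = series_mul(X, X, P), series_mul(Y, Y, P)
lhs = series_mul(combine((1, XX), (-1, Y)), combine((1, YY), (-1, X)), P)
rhs = combine((-2, XX), (-2, YY), (4, X), (4, Y), (-4, {0: 1}))
diff = combine((1, lhs), (-1, rhs))

# Truncation errors only affect exponents above P - 5, so compare below that.
identity_holds = all(v == 0 for e, v in diff.items() if e <= P - 5)
print("J25 modular equation holds to order q^%d:" % (P - 5), identity_holds)
```

Note that the right side has no $XY$ term: the coefficient of $q^{-3}$ on the left is $2a_2$, and $a_2 = 0$ for this series.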

This conceptual gap should not take away from what was a remarkable accomplishment
by Borcherds: not only the proof of the Monstrous Moonshine conjectures, but also
the definition of two new and important algebraic structures. I hope the preceding sections
give the reader some indication of why Borcherds was awarded one of the 1998 Fields
Medals.

Another approach to the Hauptmodul property is by Tuite [57], who related it to the
(conjectured) uniqueness of $V^\natural$. Norton has suggested that the reason $\mathbb{M}$ is associated to
genus-zero modular functions could be what he calls its '6-transposition' property [47].

So has Moonshine been explained? According to Conway, McKay, and many others, it
hasn't. They consider VOAs in general, and $V^\natural$ in particular, to be too complicated to be
God-given. The progress, though impressive, has broadened not lessened the fundamental
mystery, they would argue.

For what it's worth, I don't completely agree. Explaining away a mystery is a little
like grasping a bar of soap in a bathtub, or quenching a child's curiosity. Only extreme
measures like pulling the plug, or enrollment in school, ever really work. True progress
means displacing the mystery, usually from the particular to the general. Why is the sky
blue? Because of how light scatters in gases. Why are Hauptmoduls attached to each
$g \in \mathbb{M}$? Because of $V^\natural$. Mystery exists wherever we can ask 'why'; like beauty it's in
the beholder's eye.

Moonshine is now 'leaving the nest'. We are entering a consolidation phase, tidying up,
generalising, simplifying, clarifying, working out more examples. Important and interesting 
discoveries will be made in the next few years, and yes there still is mystery, but no longer 
does a Moonshiner feel like an illicit distiller: Moonshine is now a day-job! 

Acknowledgments. I warmly thank the Feza Gursey Institute in Istanbul, and in particular
Teoman Turgut, for their invitation to the workshop and hospitality during my month-long
stay. These notes are based on 16 lectures I gave there in Summer 1998. I've also benefitted
from numerous conversations with Y. Billig, A. Coste, C. Cummins, M. Gaberdiel, J. 
McKay, M. Tuite, and M. Walton — Mark Walton in particular made a very careful reading 
of the manuscript (and hence must share partial blame for any errors still remaining). My 
appreciation as well goes to J.-B. Zuber and P. Ruelle for sharing with me their personal 
stories behind the discoveries of, respectively, A-D-E in $A_1^{(1)}$ and Fermat in $A_2^{(1)}$. The
research was supported in part by NSERC.






References 

1. V. I. Arnold, Catastrophe Theory, 2nd edn. (Springer, Berlin, 1997);
M. Hazewinkel, W. Hesselink, D. Siersma, and F.D. Veldkamp, Nieuw Arch. Wisk. 25 (1977) 257;
P. Slodowy, in: Lecture Notes in Math 1008, I. Dolgachev (ed.) (Springer, Berlin, 1983).

2. M. Bauer, A. Coste, C. Itzykson, and P. Ruelle, J. Geom. Phys. 22 (1997) 134. 

3. D. Bernard, Nucl. Phys. B288 (1987) 628.

4. J. Böckenhauer and D. E. Evans, Commun. Math. Phys. 200 (1999) 57; "Modular
invariants, graphs and a-induction for nets of subfactors III", hep-th/9812110.

5. R. E. Borcherds, Invent. math. 109 (1992) 405.

6. R. E. Borcherds, "What is moonshine?", math.QA/9809110.

7. J. M. Borwein, P. B. Borwein and D. H. Bailey, Amer. Math Monthly 96 (1989) 201. 

8. A. Cappelli, C. Itzykson, and J.-B. Zuber, Commun. Math. Phys. 113 (1987) 1. 

9. R. Carter, G. Segal and I. M. Macdonald, Lectures on Lie Groups and Lie algebras 
(Cambridge University Press, Cambridge, 1995). 

10. A. Cayley, An Elementary Treatise on Elliptic Functions, 2nd edn (Dover, New York, 
1961). 

11. J. H. Conway, Math. Intelligencer 2 (1980) 165. 

12. J. H. Conway and S. P. Norton, Bull. London Math. Soc. 11 (1979) 308. 

13. J. H. Conway and N. J. A. Sloane, Sphere packings, lattices and groups, 3rd edn 
(Springer, Berlin, 1999). 

14. A. Coste and T. Gannon, Phys. Lett. B323 (1994) 316. 

15. R. Courant and D. Hilbert, Methods of Mathematical Physics II (Wiley, New York, 
1989). 

16. D. Cox, Primes of the form $x^2 + ny^2$ (Wiley, New York, 1989).

17. C. J. Cummins and T. Gannon, Invent. math. 129 (1997) 413.

18. P. Di Francesco, P. Mathieu and D. Senechal, Conformal Field Theory (Springer, New 
York, 1996). 

19. L. Dixon, P. Ginsparg, and J. Harvey, Commun. Math. Phys. 119 (1988) 221. 

20. F. Dyson, Bull. Amer. Math. Soc. 78 (1972) 635. 

21. F. J. Dyson, Math. Intelligencer 5 (1983) 47. 

22. D.E. Evans and Y. Kawahigashi, Quantum symmetries on operator algebras (Oxford 
University Press, Oxford, 1998). 

23. I. Frenkel, J. Lepowsky, and A. Meurman, Vertex operator algebras and the Monster 
(Academic Press, San Diego, 1988). 

24. J. Fuchs and C. Schweigert, Symmetries, Lie algebras, and representations (Cambridge 
University Press, Cambridge, 1997). 

25. W. Fulton and J. Harris, Representation Theory: A first course (Springer, New York, 
1996). 

26. M. R. Gaberdiel and P. Goddard, "Axiomatic conformal field theory", hep-th/9810019.

27. M. R. Gaberdiel and P. Goddard, "An introduction to meromorphic conformal field 
theory and its representations" , lecture notes in this volume. 




28. D. Gaitsgory, "Notes on two dimensional conformal field theory and string theory",
math.AG/9811061.



29. T. Gannon, "The Cappelli-Itzykson-Zuber A-D-E classification", math.QA/9902064.



30. K. Gawedzki, "Conformal field theory: a case study" , lecture notes in this volume. 

31. R. W. Gebert, Internat. J. Mod. Phys. A8 (1993) 5441.

32. P. Goddard, "The work of R.E. Borcherds", math.QA/9808136.

33. D. Gorenstein, Finite Simple Groups (Plenum, New York, 1982). 

34. D. Gorenstein, R. Lyons, and R. Solomon, The Classification of the Finite Simple 
Groups (AMS, Providence, 1994). 

35. A. Hanany and Y.-H. He, "Non-abelian finite gauge theories", hep-th/9811183.



36. J. E. Humphreys, Introduction to Lie algebras and representation theory (Springer, 
New York, 1994). 

37. V. G. Kac, In: Lecture Notes in Math 848 (Springer, New York, 1981). 

38. V. G. Kac, Infinite Dimensional Lie algebras, 3rd edn (Cambridge University Press, 
Cambridge, 1990). 

39. V. G. Kac, Vertex Algebras for Beginners, 2nd edn (AMS, Providence, 1998). 

40. V. G. Kac and M. Wakimoto, In: Lie Theory and Geometry in Honor of Bertram 
Kostant, Progress in Math. 123 (Birkhauser, Boston, 1994). 

41. S. Kass, R. V. Moody, J. Patera and R. Slansky, Affine Lie algebras, weight multiplic- 
ities, and branching rules, Vol. 1 (University of California Press, Berkeley, 1990). 

42. S. Lang, Elliptic Functions, 2nd edn (Springer, New York, 1997).

43. F. W. Lawvere and S. H. Schanuel, Conceptual Mathematics: A first introduction to 
Categories (Cambridge University Press, Cambridge, 1997). 

44. H. Minc, Nonnegative matrices (Wiley, New York, 1988).

45. E.J. Mlawer, S.G. Naculich, H.A. Riggs, and H.J. Schnitzer, Nucl. Phys. B352 (1991)
863.

46. W. Nahm, Commun. Math. Phys. 118 (1988) 171. 

47. S. P. Norton, In: Proc. Symp. Pure Math. 47 (1987) 208. 

48. A. Ocneanu, "Paths on Coxeter diagrams: From Platonic solids and singularities 
to minimal models and subfactors" (Lectures given at Fields Institute (1995), notes 
recorded by S. Goto). 

49. L. Queen, Math. of Comput. 37 (1981) 547.

50. A.N. Schellekens and S. Yankielowicz, Nucl. Phys. B327 (1989) 673. 

51. M. Schottenloher, A Mathematical Introduction to Conformal Field Theory (Springer, 
Berlin, 1997). 

52. G. Segal, In: IXth Proc. Int. Congress Math. Phys. (Hilger, 1989). 

53. S. Singh, Fermat's Enigma (Penguin Books, London, 1997). 

54. S. D. Smith, In: Finite Groups - Coming of Age, Contemp. Math. 193 (AMS, Provi- 
dence, 1996). 

55. W. Thurston, Three-dimensional geometry and topology, vol. 1 (Princeton, 1997). 

56. L. Toti Rigatelli, Evariste Galois (Birkhauser, Basel, 1996). 

57. M. Tuite, Commun. Math. Phys. 166 (1995) 495. 

58. V. G. Turaev, Quantum invariants of knots and 3-manifolds (de Gruyter, Berlin, 
1994). 




59. M. A. Walton, "Affine Kac-Moody algebras and the Wess-Zumino-Witten model",
lecture notes in this volume.

60. M. Waldschmidt, P. Moussa, J.-M. Luck, and C. Itzykson (eds.), From Number Theory
to Physics (Springer, Berlin, 1992).

61. D. Zagier, Amer. Math. Monthly 97 (1990) 144. 

62. J.-B. Zuber, Commun. Math. Phys. 179 (1996) 265. 

63. J.-B. Zuber, private communication, February 1999. 






